| HN Mirror



	by simondotau 1269 days ago
	At Twitter scale, arbitrary hard-coded rules are almost always useless; they make life more difficult for authentic posters while nefarious actors can work around them with varying degrees of triviality.

2 comments

qixxiq 1269 days ago

Honestly, our findings from working at scale (Facebook, Google, Twitter, Discord, Reddit, ...) were actually the opposite.

With hand-written (not arbitrary) rules, it's easier to understand the attacker's intent and build a system they can't work around, because we're blocking them at their source of income. Sure, they can figure out how to post messages, but unless they can include their link, payload, etc., it's not worth their time.

Machine learning defences are definitely a part of what we did, but they're slower to respond to attacks and generally easier to work around.
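
A minimal sketch of the idea of blocking the payload rather than the wording. This is an illustration only, not anything described in the thread: the blocklist, the domain names and the function names are all hypothetical, and a real system would need far more than a domain check.

```python
import re
from urllib.parse import urlparse

# Hypothetical blocklist: domains the spammer earns money through.
BLOCKED_DOMAINS = {"spam-shop.example", "cheap-pills.example"}

# Deliberately simple URL matcher; an assumption, not a full URL grammar.
URL_RE = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)


def extract_hosts(message):
    """Return the set of lowercased hostnames linked from the message."""
    hosts = set()
    for url in URL_RE.findall(message):
        try:
            host = urlparse(url).hostname
        except ValueError:
            # Malformed URLs (e.g. broken IPv6 brackets) are skipped.
            continue
        if host:
            hosts.add(host.lower().rstrip("."))
    return hosts


def is_blocked(message):
    """True if the message links to a blocked domain or any subdomain of it."""
    for host in extract_hosts(message):
        for domain in BLOCKED_DOMAINS:
            if host == domain or host.endswith("." + domain):
                return True
    return False
```

The rule ignores the message text entirely, so rewording the message does not help the sender; only changing the destination does.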

simondotau 1269 days ago

As someone who has personally battled such adversaries, I call bullshit on that. People with a financial incentive to spam in a user discussion environment are able to change pretty much every letter of their message if necessary.

anigbrowl 1269 days ago

> arbitrary hard-coded rules are almost always useless

I disagree; I just pointed out how it's not hard to get pure spam by using the filtered stream rules. If I can reliably identify & filter for spam on my creaking desktop with limited compute power and technical/coding skills, I would be happy to operate a silicon backhoe for a modest fee.

simondotau 1269 days ago

I’m talking specifically about systems the size of Twitter. Arbitrary hard-coded rules are absolutely useful for smaller systems. I run a smaller system and such rules are useful and effective.

anigbrowl 1269 days ago

These are Twitter's filtered stream rules. They're accessible via the API and let you select from the global feed in real time. I don't have access to the firehose, of course, but my understanding is that the filtered stream is an outgrowth of their internal systems. They have their own query language for filtering on Tweet and user parameters, semantic entities, URLs, etc.

https://developer.twitter.com/en/docs/twitter-api/tweets/fil...
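
For a rough idea of what such a rule looks like, here is a sketch that only builds the request body and makes no network call. The operators `has:links`, `-is:retweet` and `lang:` and the `{"add": [{"value": ..., "tag": ...}]}` shape are as I recall them from the filtered stream documentation, so treat them as assumptions to check against the docs; `build_rule` and `build_add_request` are hypothetical helper names.

```python
import json


def build_rule(keywords, require_links=True, exclude_retweets=True,
               lang=None, tag=""):
    """Compose one filtered-stream rule as a dict with 'value' and 'tag'."""
    parts = []
    if keywords:
        # Multi-word phrases are quoted so they match as exact phrases.
        terms = ['"%s"' % k if " " in k else k for k in keywords]
        parts.append("(" + " OR ".join(terms) + ")")
    if require_links:
        parts.append("has:links")      # only Tweets carrying a URL
    if exclude_retweets:
        parts.append("-is:retweet")    # drop retweets
    if lang:
        parts.append("lang:%s" % lang)
    return {"value": " ".join(parts), "tag": tag}


def build_add_request(rules):
    """Serialize rules into the JSON body used to add them to the stream."""
    return json.dumps({"add": rules})


if __name__ == "__main__":
    rule = build_rule(["crypto giveaway", "airdrop"], lang="en",
                      tag="spam-candidates")
    print(build_add_request([rule]))
```

The tag is just a label echoed back with matching Tweets, which makes it easy to tell which rule fired.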
