by mike_hearn 1490 days ago
How do you know they are obviously fake? That's the point the CEO is making. From the outside you can't always know. A lot of accounts that get fingered as 'obviously' fake aren't, in reality.

Edit: I guess I'll add some background here. I worked on anti-spam for a few years at Google and basically agree with everything Parag is saying here (although we'd disagree about many other things!). What he's laying out is spam fighting on social networks 101. Stuff that outsiders often latch onto, like user name patterns, is ~useless. To detect spam you need a whole lot of signals, many of which simply cannot be replicated by outsiders. For example, one set of signals that Google was largely ignoring when I first joined the team was protocol deviations. We built an infrastructure to force and detect them, which was effective and is still in use.

For some years now I've been writing about the plague of academic "research" into Twitter bots that uses completely invalid methodologies to try to detect spam accounts. There are over 11,000 published papers on this topic, which is absurd because very close to none of them are sound.

https://blog.plan99.net/fake-science-part-ii-bots-that-are-n...

https://blog.plan99.net/did-russian-bots-impact-brexit-ad66f...

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3814191

Tech talk for those who prefer video:

https://archive.org/details/hopeconf2020/20200726_2000_Peopl...

I'm pleased to see that the idiotic 20% claim is getting shredded on another HN thread. Perhaps now that people are waking up to the weakness of these sorts of claims, more authors will stand up and publicly debunk them. As far as I can tell, only Florian Gallwitz, Michael Kreil and I have been pointing out the problems with third-party Twitter spambot investigations in recent years.

Although I'm definitely a Muskian on free speech topics, I still feel a lot of sympathy for the spam fighters working at Twitter. The extent to which their work is second-guessed is astounding, and the fact that the second-guessers are often peddling pseudoscience with institutional credentials just makes it worse.

1 comments

Yeah. For a community that constantly berates major tech companies for FPs in policy enforcement, it is baffling to see people say that various users are just obviously spam.
False positives are expected. Handling them gracefully is another cost: the people you affect should be affected in the least bad way and be able to get back up and running quickly.
I would expect this post not to go over well on another thread about an incorrect policy action. It is definitely not the norm for this community to expect false positives.
Right? If there’s one thing we should know in this industry, it is that unless we have all the facts, we know nothing, and even when we do, we still know nothing.