Hacker News
by sbarbarian 1490 days ago
Musk has a point - the reality is that each and every Twitter user wades through obviously fake accounts every time they use the service.

This is a problem that hasn't been solved. Explanations of why that is are meaningless compared to real action and perceptible change.

4 comments

The conversation we should be having is where are we now, and what is good enough?

Having worked in large-scale anti-abuse detection for most of my career (~18 years), I find that the points in the Twitter thread align with my experience. Scaling in this area is hard. 99% efficacy sounds great until you apply it to millions or billions of items. The number of FNs (false negatives: 'bad' or unwanted things that slip through) is still substantial enough for users to notice. Taking a 229m active user count [1], 99% fake account detection efficacy still leaves you with roughly 2.3m fake accounts. Look at tweets per day and you get similarly large numbers for content detection.
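To make the scale argument concrete, here is a minimal Python sketch of the arithmetic. It follows the comment's simplification of applying the miss rate to the whole 229m active user base, so the results are rough upper bounds rather than measured counts. The function name `missed` and the extra efficacy levels are illustrative assumptions, not figures from Twitter.

```python
ACTIVE_USERS = 229_000_000  # active user count quoted in the comment [1]


def missed(population: int, efficacy: float) -> int:
    """Items that slip past detection at a given efficacy (hypothetical helper)."""
    return round(population * (1.0 - efficacy))


# Each extra "nine" of efficacy cuts the residue by 10x, yet the residue
# stays large in absolute terms when the population is this big.
for efficacy in (0.99, 0.999, 0.9999):
    print(f"{efficacy:.2%} efficacy -> ~{missed(ACTIVE_USERS, efficacy):,} missed")
```

Even at an assumed 99.99% efficacy, tens of thousands of accounts would remain under this simplification, which is why users can still notice them.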

Twitter can most likely do better given the right resources, people, and leadership support (Facebook has similar problems aligning all 3 of those). Once they have those, the open question is how much better they can get. Each incremental increase in efficacy gets more expensive.

To top it off, as detection gets better, do you think those abusing Twitter will sit still? Of course not. They'll change tactics (content, usage of hacked accounts, etc.).

[1] Twitter 10-Q 2022-Q1 - https://www.sec.gov/ix?doc=/Archives/edgar/data/0001418091/0...

How do you know they are obviously fake? That's the point the CEO is making. From the outside you can't always know. A lot of accounts that get fingered as 'obviously' fake aren't, in reality.

Edit: I guess I'll add some background here. I worked on anti-spam for a few years at Google and basically agree with everything Parag is saying here (although we'd disagree about many other things!). What he's laying out is spam fighting on social networks 101. Stuff that outsiders often latch onto, like user name patterns, is ~useless. To detect spam you need a whole lot of signals, many of which simply cannot be replicated by outsiders. For example, one set of signals that Google was largely ignoring when I first joined the team was protocol deviations. We built an infrastructure to force and detect them, which was effective and is still in use.

For some years now I've been writing about the plague of academic "research" into Twitter bots that uses completely invalid methodologies to try to detect spam accounts. There are over 11,000 published papers on this topic, which is absurd because very close to none of them are sound.

https://blog.plan99.net/fake-science-part-ii-bots-that-are-n...

https://blog.plan99.net/did-russian-bots-impact-brexit-ad66f...

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3814191

Tech talk for those who prefer video:

https://archive.org/details/hopeconf2020/20200726_2000_Peopl...

I'm pleased to see that the idiotic 20% claim is getting shredded on another HN thread. Perhaps now that people are waking up to the weakness of these sorts of claims, more authors will stand up and publicly debunk them. As far as I can tell, only Florian Gallwitz, Michael Kreil and I have been pointing out the problems with third-party Twitter spambot investigations in recent years.

Although I'm definitely a Muskian on free speech topics, I still feel a lot of sympathy for the spam fighters working at Twitter. The extent to which their work is second-guessed is astounding, and the fact that the second-guessers are often peddling pseudoscience with institutional credentials just makes it worse.

Yeah. For a community that constantly berates major tech companies for FPs in policy enforcement, it is baffling to see people say that various users are just obviously spam.

False positives are expected. It's another cost to handle them gracefully so that the people you affect are affected in the least bad way, and can get back up and running quickly.

I would expect this post to not go over well on another thread about an incorrect policy action. It is definitely not the norm that this community expects false positives.

Right? If there’s one thing we should know in this industry, it is that unless we have all the facts, we know nothing, and even when we do, we still know nothing.

In part this seems like a denominator issue. Many real Twitter users lurk, or post very seldom. Bots probably don’t lurk much - what would be the point?

If you took the likelihood of a tweet being spam, it’s going to be much higher than a user being spam, when “users” includes the lurkers you rarely see.

For the user experience, what you see is what matters and lurkers don’t count. But counting lurkers might matter more for ad impressions and revenue.
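The denominator effect described above can be illustrated with a small Python sketch. All numbers here are hypothetical and chosen only to show the mechanism: a small share of spam accounts that post heavily, next to a large population of real accounts that mostly lurk. The `Group` class and `spam_shares` function are illustrative names, not anything from Twitter.

```python
from dataclasses import dataclass


@dataclass
class Group:
    accounts: int
    tweets_per_account_per_day: float


def spam_shares(real: Group, spam: Group) -> tuple[float, float]:
    """Return (share of accounts that are spam, share of tweets that are spam)."""
    account_share = spam.accounts / (real.accounts + spam.accounts)
    real_tweets = real.accounts * real.tweets_per_account_per_day
    spam_tweets = spam.accounts * spam.tweets_per_account_per_day
    tweet_share = spam_tweets / (real_tweets + spam_tweets)
    return account_share, tweet_share


# Hypothetical inputs: 5% of accounts are spam, but they post far more often
# than the mostly-lurking real users.
real = Group(accounts=950, tweets_per_account_per_day=0.5)
spam = Group(accounts=50, tweets_per_account_per_day=20)

account_share, tweet_share = spam_shares(real, spam)
print(f"spam share of accounts: {account_share:.1%}")
print(f"spam share of tweets:   {tweet_share:.1%}")
```

Under these made-up inputs, a small minority of accounts produces the majority of tweets, so the same platform can look mostly clean when counted by user and heavily spammed when counted by what appears in a timeline.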

I wonder sometimes if Musk knows this stuff and enjoys playing dumb to troll people, or if he really is out of touch. Maybe focusing on the user experience is the right move regardless of how you get there?

Musk is the largest beneficiary of all the fake accounts pumping his follower count.

He basically lives on Twitter, so he knows all of this perfectly well. He should pay the $54/share as per the contract he signed.

The stock market crashed and keeps going down, so he wants to negotiate a different price. That's all there is to it: buyer's remorse due to deterioration of the macroeconomic environment.

And the credulity of people believing a billionaire would come to save them because he's "not caring about economics at all"...

> Musk is the largest beneficiary of all the fake accounts pumping his follower count.

Fake accounts don't pump TSLA stock. They have no purchasing power. And he has a large enough following that it really doesn't matter what the number next to his name is, whether it's 30 million or 90 million.

But I agree with you that it sounds like he's got buyer's remorse. Seems like free speech (or whatever the public reason he stated for buying Twitter) isn't worth that much to him.

They might have purchasing power. How do you know?