by scumola 1468 days ago
I agree here. The firehose is 'downstream' of the client's actual interaction with the API, which is where all of the real tracking data would live. Giving Musk the firehose data is useless to him. Musk needs the user database, the web server logs, and any historical data on suspected bots that Twitter has already collected over the years.

Some Twitter spam bots offer to sell me bitcoin. Others are accounts I don't know, opened moments ago, with a beautiful profile picture, that start following me. Both kinds are pretty obviously bots to me. The former could be found using the firehose, but the latter wouldn't be detected until they tweeted something that showed up in the firehose.

If the firehose included DMs, new accounts (with all available signup and verification data), etc., then it would be useful. As it stands, bot detection using the firehose would only catch bots based on suspicious tweet content and nothing else.
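To illustrate the blind spot, here is a minimal sketch of content-only detection. The record fields (`user`, `text`), the keyword pattern, and the account names are all made up for the example and are not the real firehose schema. An account that only follows people and never tweets produces no records, so nothing downstream can flag it.

```python
import re

# Hypothetical spam pattern; a real classifier would be far more involved.
SPAM_PATTERN = re.compile(r"\b(bitcoin|crypto|giveaway)\b", re.IGNORECASE)


def flag_from_stream(tweets):
    """Return the set of user names whose tweet text matches the spam pattern."""
    return {t["user"] for t in tweets if SPAM_PATTERN.search(t["text"])}


# Made-up sample stream. A hypothetical account "silent_follower" that follows
# people but never tweets would not appear in this data at all.
stream = [
    {"user": "coin_seller", "text": "Buy Bitcoin from me, guaranteed returns!"},
    {"user": "regular_person", "text": "Nice weather in Boulder today."},
]

flagged = flag_from_stream(stream)
print(flagged)
```

Only `coin_seller` gets flagged. The silent follower can't be, because tweet content is the only signal available here.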

Also, the firehose is a 7 Mbps (compressed) stream. It takes substantial compute power to run even just sentiment analysis, and substantial storage to keep heuristics on that volume of data. Musk would need to spend some time and money and hire some people just to make sense of the full firehose.
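For a rough sense of scale, here is a back-of-envelope sketch. It assumes the rate is a constant 7 Mbps and that "Mbps" means 10^6 bits per second; the real stream is bursty, and the uncompressed volume would be larger by whatever the compression ratio is.

```python
# Back-of-envelope: compressed data volume of a steady 7 Mbps stream.
# Assumptions: constant rate, 1 Mbps = 1,000,000 bits per second.

RATE_MBPS = 7
SECONDS_PER_DAY = 24 * 60 * 60


def compressed_bytes(rate_mbps: float, seconds: int) -> float:
    """Bytes received over `seconds` at a constant `rate_mbps`."""
    return rate_mbps * 1_000_000 / 8 * seconds


per_day_gb = compressed_bytes(RATE_MBPS, SECONDS_PER_DAY) / 1e9
per_year_tb = compressed_bytes(RATE_MBPS, 365 * SECONDS_PER_DAY) / 1e12

print(f"{per_day_gb:.1f} GB/day, {per_year_tb:.1f} TB/year (compressed)")
# prints "75.6 GB/day, 27.6 TB/year (compressed)"
```

That is only the raw compressed feed, before any derived data or per-account heuristics you would want to keep alongside it.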

Source: I worked for GNIP in 2012. GNIP used to resell the Twitter firehose and was a Twitter partner before being acquired by Twitter in 2014.