by threeseed 1191 days ago
According to this [1], Twitter has 500,000+ servers spread across DCs, GCP and AWS.

If we assume only a team of your size remains, then it would take 300+ days.

That would mean no OS patches etc., which would put them firmly in the crosshairs of the FTC.

[1] https://twitter.com/d_feldman/status/1562265193249390593

2 comments

Interesting. Taking MAU to be around 350 million, that's about 700 users per server. Of course it's not that simple, because not all servers are the same and, more importantly, not all users are the same, but that sounds a bit on the low end.

Anecdotal data point: infosec.exchange hosts 30k users on 7 servers (https://infosec.exchange/@jerry/109374478717918484). That's roughly 4,300 users per server, about six times the ~700 per server implied by the Twitter numbers. Again, not the same usage and performance requirements, but I find it interesting.
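
For anyone who wants to check the arithmetic, here is a quick back-of-the-envelope sketch. It only uses the figures quoted in this thread (350 million MAU, 500,000 servers, 30k users on 7 servers); the function name is just something I made up for illustration.

```python
# Back-of-the-envelope users-per-server figures, using only numbers quoted in this thread.
def users_per_server(users: int, servers: int) -> float:
    """Average number of users served by each machine."""
    return users / servers

twitter = users_per_server(350_000_000, 500_000)  # MAU estimate and server count from above
infosec = users_per_server(30_000, 7)             # infosec.exchange figures from above

print(f"Twitter: ~{twitter:.0f} users/server")
print(f"infosec.exchange: ~{infosec:.0f} users/server")
print(f"infosec.exchange packs ~{infosec / twitter:.1f}x more users per server")
```

This treats every server and every user as identical, which they obviously aren't, so take it as an order-of-magnitude comparison only.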

If you are split-cloud under a homogeneous puppet master without homogeneous break-glass SSH access (which would be crazy) then probably your best bet is to just re-kick the world. But the scaling factor for this sort of thing is most certainly not team size; it's "how many X servers can be down at the same time", which will increase with your number of servers. In any case I think the FTC is the least of Twitter's concerns right now.
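
To make the scaling point concrete, here is a rough sketch of a rolling re-kick estimate. Everything in it is an assumption for illustration: the 5% "allowed down at once" figure, the 2 hours per batch, and the function name are hypothetical, not anything known about Twitter's fleet.

```python
import math

def rolling_rekick_hours(total_servers: int, max_down_percent: int, hours_per_batch: float) -> float:
    """Wall-clock hours to reinstall a fleet when only a percentage may be down at once.

    All inputs are hypothetical; this ignores per-service constraints and failures.
    """
    # Largest batch we are allowed to take down together (at least one server).
    batch_size = max(1, total_servers * max_down_percent // 100)
    # Number of sequential batches needed to cover the whole fleet.
    batches = math.ceil(total_servers / batch_size)
    return batches * hours_per_batch

# Hypothetical inputs: 5% of the fleet down at a time, 2 hours per batch.
print(rolling_rekick_hours(500_000, 5, 2.0))
print(rolling_rekick_hours(1_000, 5, 2.0))
```

Under these assumptions the wall-clock time is driven by the fraction you can afford to have down and the time per batch, not by headcount or by the raw number of servers.
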
Not sure if it's still the case, but last time I had co-located servers you could access the systems via OOB without needing to reboot them into single-user mode.

If it's not the case, then Twitter is definitely in far more trouble because, according to past engineers, at least a few of their services needed manual intervention on a full-scale reboot. And losing quorum in a distributed system is never pretty.