by justsomeadvice0 1187 days ago
If you are split-cloud under a homogeneous Puppet master without homogeneous break-glass SSH access (which would be crazy), then your best bet is probably to just re-kick the world. But the scaling factor for this sort of thing is most certainly not team size; it's "how many X servers can be down at the same time", which will increase with your number of servers. In any case, I think the FTC is the least of Twitter's concerns right now.
1 comment

Not sure if it's still the case, but last time I had co-located servers you could access the systems via OOB without needing to reboot them in single-user mode.

If it's not the case, then Twitter is definitely in far more trouble because, according to past engineers, at least a few of their services needed manual intervention on a full-scale reboot. And losing quorum in a distributed system is never pretty.