by prolepunk 2465 days ago
Anecdotal evidence.

I remember working for a company about a decade ago where we spent so much time engineering duplicate hardware for everything, with hot spares everywhere.

Guess what: when a switch failed, it didn't fail over properly. When a router failed, it started sending spurious packets everywhere and had to be taken down manually.

All the effort that went into duplicating hardware and making hot spares could have been saved by just... having cold spares, and in the end the amount of downtime would have been the same, or less, because when one thing fails it's really obvious where things stopped working.