Hacker News
by sneak 1889 days ago
> But, does it really matter?

I think an important consideration here is that a huge amount of time, money, and resources is spent on making sure the computers stay powered and cooled in all manner of situations. We contract redundant diesel delivery for generators; we buy and install gigantic diesel generator systems that are used for just minutes per year, along with huge automatic grid transfer switches, redundant fiber optic loops, dynamic routing protocols, N+1 this and double-redundant that. It's tremendously expensive in terms of money, human time, and physical/natural resources.

The point is that we are always striving to plan for failures and engineer them out. When there is an actual real-life outage, it necessarily means, given the huge amount of time, money, and resources invested in planning for disaster/failure resilience, that the plan has a bug or an error.

Somebody had a responsibility (be it planning, engineering, or otherwise) that was not appropriately fulfilled.

Sure, they'll find it, and update their plan, and be able to respond better in the future - but the fundamental idea is that millions (billions?) have been spent in advance to prevent this from happening. That's not nothing.