by justinsb 5915 days ago
I'm not going to treat a cosmic ray corrupting one single network packet the same way I treat a hurricane cutting off power to a datacenter for 2 weeks. I do see the intellectual appeal in doing so, but we'll just have to agree to disagree!

Uhm, of course they're not the same thing... but they have the same effect. The point is that the system remains available even if a node becomes unavailable for _whatever_ reason. I'm not sure what you're disagreeing with... There are common and uncommon modes of failure. Of course we should prioritize handling the common ones. But if we can handle all of them at once, that's ideal. And, as I said in an earlier comment, when you're doing a million operations a second, one-in-a-million failures happen about once a second on average.
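
To put rough numbers on that last point, here's a minimal sketch of the arithmetic. It assumes each operation fails independently with the same probability, which is a simplification (real failures are often correlated), and the function names are just made up for illustration:

```python
import math

def expected_failures(ops_per_second: float, failure_probability: float) -> float:
    """Mean number of failures per second: rate times per-operation probability."""
    return ops_per_second * failure_probability

def prob_at_least_one_failure(ops_per_second: int, failure_probability: float) -> float:
    """Chance that at least one operation fails in a given second.

    Equal to 1 - (1 - p)^n, computed with log1p/expm1 so that very small p
    does not lose precision. Assumes independent failures.
    """
    return -math.expm1(ops_per_second * math.log1p(-failure_probability))

rate = 1_000_000   # operations per second (the figure from the comment)
p = 1e-6           # a "one-in-a-million" failure

mean = expected_failures(rate, p)
p_any = prob_at_least_one_failure(rate, p)

print(f"expected failures per second: {mean:.3f}")
print(f"chance of at least one failure in a given second: {p_any:.3f}")
```

Under those assumptions you get about one failure per second on average, and roughly a 63% chance that any particular second contains at least one. So "rare" per operation is routine at the system level.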