|
|
|
|
|
by dns_snek
850 days ago
|
|
How do you square that with the fact that shit usually hits the fan precisely because of this complexity, not in spite of it? That's my observation & experience, anyway. Added bits of "resiliency" often add brand new, unexplored failure points that are just ticking time bombs waiting to bring the entire system down. |
|
I can tell you 100% that eventually a disk will fail. I can tell you 100% that eventually the power will go out. I can tell you 100% that even if you have a computer with redundant power supplies each connected to separate grids, eventually both power supplies will fail at the same time - it just will happen a lot less often than if you have a regular computer not on any redundant/backup power. I can tell you that network cables do break from time to time. I can tell you that buildings are vulnerable to earthquakes, fires, floods, tornadoes and other such disasters). I can tell you that software is not perfect and eventually crashes. I can tell you that upgrades are hard if any protocol changed. I can tell you there is a long list of other known disasters that I didn't list, but a little research will discover.
I could look up the odds of the above. In turn this allows calculating the costs of each mitigation against the likely cost of not mitigating it - but this is only statistical you may decide something statistically cannot happen and it does anyway.
What I cannot tell you is how much you should mitigate. There is a cost to each mitigation that need to be compared to the value.