by darkmighty
2906 days ago
The key part of redundancy is that your "redundancy glue"[1] must be significantly more reliable than each component, including its software and implementation, because the glue failing on its own can often cause an outage. Without redundancy, the probability of failure is simply P(single failure); for 2x parallel redundant systems (assuming independent failures) it is approximately P(single failure)^2 + P(glue failure). If P(single failure)^2 ~ 0, we need P(glue failure) < P(single failure), at the very least, for the redundancy to help.

[1] i.e. the systems that interconnect the multiple redundant systems, detect failures, redirect traffic, etc.
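To put rough numbers on this, here is a minimal Python sketch. It assumes that all failures are independent and that an outage happens when either every replica fails or the glue fails. The function name and the example probabilities are made up for illustration, not measurements of any real system.

```python
def redundant_failure_probability(p_single: float, p_glue: float, n: int = 2) -> float:
    """Outage probability for n parallel replicas behind a single glue layer.

    Assumes independent failures. An outage occurs if all n replicas fail
    or if the glue (failover, health checks, routing) fails.
    """
    p_all_replicas = p_single ** n
    # Exact form under independence; for small probabilities this is
    # approximately p_single**n + p_glue, the expression used in the comment.
    return 1.0 - (1.0 - p_all_replicas) * (1.0 - p_glue)


# Hypothetical numbers: each component fails 1% of the time.
p_single = 0.01

# Glue ten times more reliable than a component: redundancy helps.
reliable_glue = redundant_failure_probability(p_single, p_glue=0.001)

# Glue less reliable than a component: redundancy makes things worse.
flaky_glue = redundant_failure_probability(p_single, p_glue=0.02)

print(f"no redundancy:  {p_single:.6f}")
print(f"reliable glue:  {reliable_glue:.6f}")
print(f"flaky glue:     {flaky_glue:.6f}")
```

With these made-up inputs the reliable-glue case comes out well below the single-component failure rate, while the flaky-glue case comes out above it, which is the point of the inequality: once P(single failure)^2 is negligible, the glue dominates the total.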
|
Turtles all the way down, I guess.