|
|
|
|
|
by zzzcpan
2166 days ago
|
|
Honestly, you are not making any sense. This is not how engineering works. If you design for resilience, you get more resilience and you build confidence as you see the evidence how the system works in real world. Furthermore, with resilience you have to always cover all risks, it's just that you don't immediately reach fine granularity of decisions that don't trigger failover to servers in different countries, you improve granularity as you learn from actual operations and modify your designs accordingly. I remember when I first deployed DNS routed system it was too reactive, constantly jumping between servers, monitoring was too sensitive, it didn't wait for servers to stabilize to return them into the mix and exponential backoff was taking servers out for far too long. But even given all that it was still able to avoid outages caused by data center failures and connectivity problems. |
|
> If you design for resilience, you get more resilience and you build confidence as you see the evidence how the system works in real world.
You simply can't foresee or eliminate all risk. This is referred to as "the turkey problem." It's not my idea, but one I certainly subscribe to.
https://www.convexresearch.com.br/en/insights/the-turkey-pro...