by harikb
2379 days ago
The most famous S3 outage was operator error by a well-meaning privileged user. The fact that it hasn’t happened to Lambda is just luck. Shit happens, and we can’t keep designing ever more complicated solutions. Maybe our services should degrade gracefully when shit happens instead of trying to create a big bang and spawn an alternate universe.

Cellular Architecture was largely a reaction to the S3 outage [0]. I agree that any one cell is still bound to fail due to unknown unknowns or unpatchable known unknowns, but reducing the blast radius [1] so that a failure doesn't make the service globally unavailable [2] is a step in the right direction.
[0] https://www.youtube-nocookie.com/embed/swQbA4zub20
[1] https://blog.acolyer.org/2016/09/12/on-designing-and-deployi...
[2] https://blog.acolyer.org/2015/05/07/large-scale-cluster-mana...
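
To make the blast-radius point concrete, here is a minimal Python sketch. It is not how S3 or Lambda actually route requests; the hash-based `cell_for` assignment, the customer IDs, and the cell counts are all assumptions made up for illustration. It only shows that when customers are pinned to independent cells, losing one cell takes out roughly 1/N of them instead of everyone.

```python
import hashlib


def cell_for(customer_id: str, num_cells: int) -> int:
    """Pin a customer to a cell using a stable hash (hypothetical scheme)."""
    digest = hashlib.sha256(customer_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_cells


def affected_fraction(customers, num_cells, failed_cells):
    """Fraction of customers whose cell is in the failed set."""
    hit = sum(1 for c in customers if cell_for(c, num_cells) in failed_cells)
    return hit / len(customers)


# Hypothetical customer population, for illustration only.
customers = [f"customer-{i}" for i in range(10_000)]

# One global deployment: a single failure affects everyone.
single_cell = affected_fraction(customers, num_cells=1, failed_cells={0})

# Ten independent cells: the same failure is confined to one cell.
ten_cells = affected_fraction(customers, num_cells=10, failed_cells={0})

print(f"1 cell, 1 failed:   {single_cell:.1%} of customers affected")
print(f"10 cells, 1 failed: {ten_cells:.1%} of customers affected")
```

The sketch ignores the hard parts (keeping the routing layer itself simple enough not to become the new global failure point, and deploying changes one cell at a time), which is where the operator-error concern above still applies.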