by james_cowling
3700 days ago
This is a really great question, since the vast majority of the work was in ensuring correctness and reliability: everything from testing discipline to fault injection to auditing. This also included hardware testing, like pulling out circuit breakers to test our power distribution, or overheating a rack to test graceful shutdown. I'll give a slightly lazy answer here, however, and point you to a talk I gave about building durable systems, which covers a lot of this material:
https://www.oreilly.com/events/velocity/devops-web-performan...