by dcurtis 1549 days ago
Simply, and without too many details: the load balancer failed to work properly when one server in the cluster stopped responding, causing a cascade of errors that crashed all of the other servers one after another (including, interestingly, the RDS database server -- a crash that even Amazon was unable to explain).