by ravedave5 2694 days ago
I got bitten by this! Even worse, one of the servers crumpled because we didn't scale up fast enough, so AWS killed it because of the health metric. That then took out the remaining two, because they were suddenly far, far over capacity. I got the PagerDuty alert, found a total cluster, and just manually set it to scale up way bigger. Now for all big events we manually bump minimum server counts for that period :\
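For anyone who hasn't watched this kind of cascade happen, here is a rough toy model of it in Python. Everything in it is an assumption for illustration: the load and capacity numbers are made up, the load balancer is assumed to split traffic evenly, and `simulate_cascade` is a hypothetical helper, not anything from AWS.

```python
def simulate_cascade(total_load, capacities):
    """Toy model: traffic is split evenly across live servers, and any
    server whose share exceeds its capacity is killed that round.

    Returns (rounds, survivors): rounds lists the server indexes killed
    in each round, survivors lists the indexes still alive at the end.
    """
    alive = dict(enumerate(capacities))
    rounds = []
    while alive:
        share = total_load / len(alive)  # even split (assumption)
        failed = [i for i, cap in alive.items() if share > cap]
        if not failed:
            break  # everyone is within capacity, cluster is stable
        for i in failed:
            del alive[i]  # health check fails, instance is terminated
        rounds.append(failed)
    return rounds, sorted(alive)


# Three servers, one slightly weaker, during a traffic spike (made-up numbers).
print(simulate_cascade(285, [90, 100, 100]))

# Same spike with the minimum server count bumped ahead of time.
print(simulate_cascade(285, [90, 100, 100, 100, 100, 100]))
```

In the first call the weakest server goes first, and its share lands on the other two, which pushes them over as well. In the second call nobody is over capacity, so nothing gets killed, which is the whole point of raising the minimum before the event.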