by ycombinatorMan 3563 days ago
Out of curiosity, why are you deploying to all your web servers simultaneously? Could you not do a partial roll-out to make sure something like this doesn't happen?

I doubt a partial roll-out would have helped in this particular case, since it only happens under high load and they roll out new code twice a day.
Correct. We don't roll out during peak load either.
Have you considered at least starting your release canary during peak load?
We have talked about it. It is unlikely to have helped with an event like this, and I don't recall an event where it would have. It also has the downside of extending our deployment cycle by a lot. Notably, we do run a canary internally, and that had no issues, which actually threw us off for a while: the app was partially down for users but working for us, and that hasn't happened to us in a while.