Hacker News
by dkuebric 4747 days ago
One thing worth noting about these rolling restarts that I didn't see in your post: if the new code isn't completely backwards-compatible, you can end up in bad states while a mix of old and new workers is running. That negates a lot of the value of the rolling restart, because it trades downtime for other failure modes.

For example, say you introduce a new AJAX endpoint in the release. A client hits a new worker, which generates an HTML page that calls that endpoint, but 90% of your gunicorn workers are still serving the old version of the app. There's roughly a 90% chance that the follow-up request lands on an old worker and gets a 404.
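
To put a number on that, here's a rough sketch. It assumes each request lands on a worker picked uniformly at random, which is a simplification of how workers actually pick up connections. The function names and the 10-worker pool are made up for illustration.

```python
import random


def expected_404_rate(total_workers, new_workers):
    """Fraction of follow-up requests that land on an old worker.

    Assumes (hypothetically) that every worker is equally likely
    to receive any given request.
    """
    return (total_workers - new_workers) / total_workers


def simulate_404_rate(total_workers, new_workers, trials=100_000, seed=0):
    """Monte Carlo check of the same figure under the same assumption."""
    rng = random.Random(seed)
    misses = 0
    for _ in range(trials):
        # Workers 0..new_workers-1 run the new code; the rest run the old code.
        worker = rng.randrange(total_workers)
        if worker >= new_workers:
            # The old worker doesn't know the new endpoint, so it returns 404.
            misses += 1
    return misses / trials


if __name__ == "__main__":
    print(expected_404_rate(10, 1))
    print(simulate_404_rate(10, 1))
```

The miss rate falls as the restart progresses, so the window closes on its own, but every page served by a new worker before then is exposed to it.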

1 comment

That's definitely an issue, and you have to weigh the time it takes to work around it against how long you can tolerate the service being unreachable.

A new AJAX endpoint is pretty simple to work around with two releases: one to add the endpoint and another to start using it. Most changes probably aren't that easy to split.
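
As a sketch of why the two-release split works, you can model each release as the set of endpoints it serves plus the set of endpoints its pages call, then check whether any mix of two versions can produce a call the other version can't answer. The release names, paths, and `safe_to_roll` helper below are hypothetical, not part of gunicorn or any deploy tool.

```python
# Hypothetical releases: v1 is the current app, v2 only adds the endpoint,
# v3 ships the page that actually calls it.
RELEASES = {
    "v1": {"endpoints": {"/"}, "page_calls": set()},
    "v2": {"endpoints": {"/", "/api/new"}, "page_calls": set()},
    "v3": {"endpoints": {"/", "/api/new"}, "page_calls": {"/api/new"}},
}


def safe_to_roll(old, new):
    """True if any page from either version can be served by both versions.

    During a rolling restart a page may be rendered by one version and its
    follow-up requests answered by the other, so every call made by either
    version's pages must exist in both.
    """
    a, b = RELEASES[old], RELEASES[new]
    calls = a["page_calls"] | b["page_calls"]
    return calls <= a["endpoints"] and calls <= b["endpoints"]


if __name__ == "__main__":
    print(safe_to_roll("v1", "v3"))  # single release: add and use at once
    print(safe_to_roll("v1", "v2"))  # release 1: add the endpoint only
    print(safe_to_roll("v2", "v3"))  # release 2: start calling it
```

Going straight from v1 to v3 fails the check, while each of the two smaller steps passes. Changes that can't be expressed as "add first, use later" don't decompose this cleanly.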