by sesm 944 days ago
Summary of the article: use exponential backoff + jitter for retry intervals.
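
For anyone who wants that spelled out, here is a minimal Python sketch. It assumes the "full jitter" variant (a uniform random delay between 0 and a capped exponential ceiling), which may not be the exact scheme the article uses. The names `backoff_delay` and `retry_with_backoff` and the default values are made up for illustration.

```python
import random
import time


def backoff_delay(attempt, base=0.5, cap=30.0, rng=random):
    """Delay in seconds before retry number `attempt` (0-indexed).

    The ceiling doubles with each attempt, up to `cap`; the actual delay is
    drawn uniformly from [0, ceiling] so clients do not retry in lockstep.
    """
    ceiling = min(cap, base * (2 ** attempt))
    return rng.uniform(0.0, ceiling)


def retry_with_backoff(operation, max_attempts=5, base=0.5, cap=30.0,
                       sleep=time.sleep):
    """Call `operation` until it succeeds or `max_attempts` is reached."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(backoff_delay(attempt, base, cap))
```

In real code you would probably catch only errors that are worth retrying, rather than a bare `Exception`.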

What the author didn’t mention: sometimes you want to add jitter to delay the first request too, if the request happens immediately after some event from the server (like the server waking up). If you don’t do this, you may crash the server, and if your exponential backoff counter is not global you can even put the server into a cyclic restart.
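
A rough sketch of what I mean by jittering the first request, in Python. The `on_server_ready` hook, the `window` parameter and its default are hypothetical; how big the window should be depends on how many clients you have and how much load the server can take while starting up.

```python
import random
import time


def initial_delay(window=5.0, rng=random):
    """Random delay in [0, window] seconds before a client's first request."""
    return rng.uniform(0.0, window)


def on_server_ready(send_request, window=5.0, sleep=time.sleep):
    """Hypothetical handler for a 'server is up' event.

    Each client waits its own random delay first, so the requests are
    spread over `window` seconds instead of all arriving at once.
    """
    sleep(initial_delay(window))
    return send_request()
```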

2 comments

If you can crash the server with an improperly timed request, then you have a much bigger problem than client-side stuff.
I think what they mean is something that would cause clients to do something at the same time (could be all sorts: a synchronised crash, timers aligned to clock time, etc.). If the requests aren't user-driven then yes, you likely would want to include some jitter in the first request too.

Funnily, you'll notice that some of the visualisations have the clients staggering their first request. It's exactly for this reason. I wanted the visualisations to be as deterministic as possible while still feeling somewhat realistic. This staggering was a bit of a compromise.

Not sure what is meant by "if your exponential backoff counter is not global", though. Would love to know more about that.

True, but you can imagine something like a websocket to all clients getting reset and everyone re-connecting, re-authenticating, and getting a new payload.
One example: if a datacenter loses power and all the hosts are then turned on at the same time, they can all send requests at once and crash a server.
Yes. The worst that should happen is getting a 404 or something. A crash due to requesting a piece of data that has not yet been created is poor design.
Yup, classic Thundering Herd Problem