Hacker News
by joshka 944 days ago
For a lot of things, retrying once and only once (at the outermost layer, to avoid multiplicative amplification) is more correct. At a large enough scale, a request that has already failed twice is very likely (like 90%+) to fail a third time, regardless of backoff / jitter. This means that the second retry only serves to add more load to an already failing service.
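
To put rough numbers on the multiplicative amplification, here is a minimal sketch. It assumes the worst case, where every layer in the call chain retries independently and every attempt fails; the function names are made up for illustration.

```python
def worst_case_attempts(layers: int, retries_per_layer: int) -> int:
    """Calls hitting the bottom service when each of `layers` layers
    retries `retries_per_layer` times and every attempt fails."""
    return (1 + retries_per_layer) ** layers


def outermost_only_attempts(retries: int) -> int:
    """Calls hitting the bottom service when only the outermost layer retries."""
    return 1 + retries


# Three layers, each retrying 3 times: 4 * 4 * 4 calls for one user request.
print(worst_case_attempts(layers=3, retries_per_layer=3))  # 64
# A single retry at the outermost layer only.
print(outermost_only_attempts(retries=1))  # 2
```

So a failing service three layers down sees 64x the load in the first setup and 2x in the second.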
2 comments

Correct. It's also the case that human-generated requests lose their relevance within seconds, so a quick retry is all they're worth. For machine-generated requests a dead letter queue would make more sense: poorly engineered backend services would OOM and well-engineered ones would load shed, and if the requests are queued on the application servers they are doomed to be lost anyway.
Retrying end-to-end instead of stepwise greatly reduces the reliability of a process with a reasonable number of steps.
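
A back-of-the-envelope sketch of that reliability gap, assuming every step succeeds independently with the same probability `p`. That assumption is mine, and it is exactly what the parent comment argues breaks down at scale, so treat the numbers as illustrative only.

```python
def end_to_end_success(p: float, steps: int, retries: int = 1) -> float:
    """Rerun the whole process on any failure, up to `retries` extra times."""
    one_pass = p ** steps  # every step must succeed within a single pass
    return 1 - (1 - one_pass) ** (retries + 1)


def stepwise_success(p: float, steps: int, retries: int = 1) -> float:
    """Retry each step on its own, up to `retries` extra times per step."""
    one_step = 1 - (1 - p) ** (retries + 1)
    return one_step ** steps


p, steps = 0.99, 10
print(f"end-to-end: {end_to_end_success(p, steps):.4f}")
print(f"stepwise:   {stepwise_success(p, steps):.4f}")
```

With 10 steps at 99% each and one retry, end-to-end comes out around 99.1% and stepwise around 99.9%, so roughly a 9x difference in failure rate under these assumptions.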

That being said, processes should ideally fail in ways that make it clear whether an error is retryable or not.
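
One way this could look in practice, as a rough sketch: the callee signals retryability through the exception type, and the caller retries at most once. `RetryableError`, `PermanentError`, and `call_with_single_retry` are hypothetical names, not from any particular library.

```python
class RetryableError(Exception):
    """Hypothetical: a transient failure (timeout, shed load) worth one more try."""


class PermanentError(Exception):
    """Hypothetical: a failure that retrying cannot fix (bad input, missing auth)."""


def call_with_single_retry(operation):
    """Run `operation`, retrying exactly once and only on RetryableError.

    PermanentError and any other exception propagate immediately,
    and so does a failure on the second attempt.
    """
    try:
        return operation()
    except RetryableError:
        return operation()
```

Because the error type carries the decision, the caller doesn't have to guess from status codes or message strings whether a retry is worth it.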