by azlev 620 days ago
Good reading.

In my last job, the service mesh was responsible for retries. It was a startup and the system was changing every day.

After a while, we suspected that some services were not reliable enough and that retries were hiding this fact. Turning off retries exposed that quality had, in fact, gone down.

In the end, we kept retries enabled for only some services.

I have never tested either retry budgets or deadline propagation. I will suggest them in the future.
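
For anyone unfamiliar with the two ideas, here is a rough sketch of how they might combine. It is not what our mesh did and not taken from the article; `RetryBudget`, `call_with_retries`, and the 10-to-1 ratio are all hypothetical names and numbers chosen for illustration. Each ordinary request earns one token, a retry costs ten, and the caller passes the remaining time down to the callee.

```python
import time


class RetryBudget:
    """Hypothetical budget: each request earns 1 token, each retry costs `retry_cost`."""

    def __init__(self, retry_cost=10, max_tokens=100):
        self.retry_cost = retry_cost
        self.max_tokens = max_tokens
        self.tokens = 0

    def record_request(self):
        # Ordinary traffic slowly refills the budget, up to a cap.
        self.tokens = min(self.max_tokens, self.tokens + 1)

    def try_spend(self):
        # A retry is allowed only if enough tokens have accumulated.
        if self.tokens >= self.retry_cost:
            self.tokens -= self.retry_cost
            return True
        return False


def call_with_retries(fn, budget, deadline, max_attempts=3):
    """Call fn(remaining_seconds); retry only while budget and deadline allow.

    `deadline` is an absolute time.monotonic() value, so the callee always
    receives what is left of the caller's time rather than a fresh timeout.
    """
    budget.record_request()
    last_error = None
    for attempt in range(max_attempts):
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break  # deadline already passed, do not start another attempt
        if attempt > 0 and not budget.try_spend():
            break  # budget exhausted, surface the failure instead of retrying
        try:
            return fn(remaining)
        except Exception as exc:
            last_error = exc
    raise last_error or TimeoutError("deadline exceeded before first attempt")
```

With these made-up numbers, retries add at most roughly 10% extra load, so an unreliable service fails visibly instead of being papered over.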

1 comment

Why not just add telemetry to see when requests are retried?
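
Something along these lines, as a minimal sketch: `retry_counts` and `call_with_retry_telemetry` are hypothetical names, and a real setup would export the count to whatever metrics system the mesh already uses rather than an in-memory counter.

```python
from collections import Counter

# Hypothetical in-memory metric, keyed by service name.
retry_counts = Counter()


def call_with_retry_telemetry(service, fn, max_attempts=3):
    """Retry fn() up to max_attempts times, counting every retry per service."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts, let the caller see the failure
            retry_counts[service] += 1  # record that a retry is about to happen
```

A service whose retry count keeps climbing is unreliable even if callers never see an error.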