Hacker News
by dnautics 2760 days ago
Is that true? What if you have an error that happens once every 10^9 requests and the service failure isn't critical (no one dies)? Isn't it better to just not bother with the error, let the service keep running, and not worry about it?

That depends on whether you are going to be the person who has to find out why corrupted data was written, when it was corrupted, what the original value was, or, even worse, whether a given record is corrupted at all, because the corrupted records look exactly the same as some valid ones.

As someone who has been in that boat, I can tell you that request termination due to an uncaught exception is infinitely better than the horror of debugging data corruption, even (or especially) when it only happens every 10^9 requests.

If it's 10^9 requests, then obviously they are low-cost processes in the first place, so you're better off just re-doing the job that failed.
That's what I mean. Yes.
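
For what it's worth, "just redo the job that failed" could look roughly like the sketch below. It assumes the job is cheap, safe to run twice, and signals failure by raising; `run_with_retry` and its `attempts` parameter are made-up names for illustration, not anything from a particular library.

```python
def run_with_retry(job, attempts=3):
    """Run job(); if it raises, run it again, up to `attempts` times.

    Hypothetical helper: assumes job is cheap and safe to repeat.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            return job()
        except Exception as exc:
            # The failed attempt produced no result, so nothing bad is kept.
            last_error = exc
    # Every attempt failed: surface the last error instead of hiding it.
    raise last_error
```

This only works because the failure is loud. If the job returned a wrong value quietly, there would be nothing to trigger the retry.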

But when the caller forgets to check some error code, nothing will notice that the call has failed, and now you have corrupt data to deal with. That's why I think blowing up at the moment something goes wrong is so valuable.
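
A small sketch of the difference, with made-up names (`parse_amount_code`, `store_unchecked` and so on are hypothetical, and the dict stands in for whatever data store you actually have). The only real API relied on is that Python's `int()` raises `ValueError` on bad input.

```python
def parse_amount_code(text):
    """Error-code style: returns (value, ok). The caller must check ok."""
    try:
        return int(text), True
    except ValueError:
        return 0, False


def parse_amount_raise(text):
    """Fail-fast style: int() raises ValueError on bad input."""
    return int(text)


def store_unchecked(db, key, text):
    value, _ok = parse_amount_code(text)  # bug: ok is never checked
    db[key] = value  # a failed parse is stored as 0, same as a real 0


def store_fail_fast(db, key, text):
    # The right-hand side raises before the assignment, so nothing is written.
    db[key] = parse_amount_raise(text)
```

With `store_unchecked`, the input `"oops"` and the input `"0"` leave identical records behind, which is exactly the "is this record corrupted or not" problem. With `store_fail_fast`, the bad request dies and the store is untouched.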