by saurik 222 days ago
Yeah... it has felt kind of ridiculous over the years how many times I have tracked some bug I was experiencing down to a timeout someone added in the code for a project I was working with, and I have come to the conclusion over the years that the fix is always to remove the timeout: the existence of a timeout is, inherently, a bug, not a feature, and if your design fundamentally relies on a timeout to function, then the design is also inherently flawed.

How would you handle the case where some web service is making calls to a 3rd party and that 3rd party is failing in unexpected ways (e.g. under high load, or IPs are not answering due to routing issues), to avoid a snowball effect on your service, without using the timeout concept in any way?
You put the timeout on the socket, not your application. Your application shouldn't care how long it takes, as long as progress is being made, which the socket will know about, but you won't. If you put a timeout on your application and then retry, you'll just make the problem worse. Your original packets are still in a buffer somewhere and still will be processed. Retrying won't help the situation.
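To make "timeout on the socket, not your application" concrete, here is a minimal Python sketch. It relies on the fact that `settimeout()` applies to each blocking socket operation rather than to the transfer as a whole, so a slow peer that keeps sending is never cut off. The names `read_all`, `slow_sender`, and `demo`, and the specific delays, are made up for illustration.

```python
import socket
import threading
import time


def read_all(sock, per_read_timeout):
    """Read until EOF. The timeout bounds each recv(), not the whole transfer."""
    sock.settimeout(per_read_timeout)
    chunks = []
    while True:
        # Raises socket.timeout only if *this* read sees no data in time.
        data = sock.recv(4096)
        if not data:
            break
        chunks.append(data)
    return b"".join(chunks)


def slow_sender(sock, pieces, gap_s):
    """Hypothetical slow peer: sends one piece every gap_s seconds, then closes."""
    for piece in pieces:
        time.sleep(gap_s)
        sock.sendall(piece)
    sock.close()


def demo():
    a, b = socket.socketpair()
    pieces = [b"x" * 10] * 12  # 12 pieces, 0.05 s apart: about 0.6 s in total
    sender = threading.Thread(target=slow_sender, args=(a, pieces, 0.05))
    sender.start()
    started = time.monotonic()
    # The per-read timeout (0.4 s) is shorter than the whole transfer,
    # but each individual read makes progress well within it.
    data = read_all(b, per_read_timeout=0.4)
    elapsed = time.monotonic() - started
    sender.join()
    b.close()
    return len(data), elapsed
```

The whole transfer takes longer than the timeout value, yet nothing times out, because every read sees progress. An application-level deadline of 0.4 s on the same transfer would have killed it.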
The socket should also not have a timeout.
Sockets actually need a timeout because there is no signal that a client has disconnected. Eventually, maybe, a router along the path will be nice enough to send you a RST packet, but it isn’t guaranteed.
People put a lot of timeouts in code when there are humans in the loop that should handle the timeout. An outgoing socket (as is the case in this scenario) really should not have a timeout.

An incoming one might have a timeout if there is no other way to garbage collect the connection, but, if at all possible, that should usually be in the higher layers, not the lower ones.

(Maybe read my other response to the person you responded to? I purposefully gave you a really short and matter-of-fact statement that fit into the discussion from the thread more broadly.)

I’m explicitly saying not to put timeouts in code… but you must put a timeout on a socket due to the way they work. Period. Or deal with the default, which is usually many minutes. Sockets time out when packets haven’t been acknowledged for a long time, but you can also set an idle timeout.

A timeout on sockets isn’t negotiable.
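For reference, a rough Python sketch of the two socket-level knobs mentioned here: keepalive probes for idle connections and a cap on how long sent data may stay unacknowledged. `SO_KEEPALIVE` is widely available; the `TCP_*` options vary by platform (`TCP_USER_TIMEOUT` is Linux-specific), so the sketch only sets what the running system exposes. The function name `configure_socket_timeouts` and the default values are assumptions for illustration, not recommendations.

```python
import socket


def configure_socket_timeouts(sock, idle_s=60, interval_s=10, probes=5,
                              unacked_ms=30000):
    """Enable keepalive and, where supported, tune the kernel-level timeouts.

    Returns the names of the options that were actually applied.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    applied = ["SO_KEEPALIVE"]

    optional = (
        ("TCP_KEEPIDLE", idle_s),          # idle seconds before the first probe
        ("TCP_KEEPINTVL", interval_s),     # seconds between probes
        ("TCP_KEEPCNT", probes),           # failed probes before giving up
        ("TCP_USER_TIMEOUT", unacked_ms),  # max ms that sent data may stay unacked
    )
    for name, value in optional:
        opt = getattr(socket, name, None)  # not every platform defines these
        if opt is None:
            continue
        try:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)
            applied.append(name)
        except OSError:
            pass  # defined by Python but rejected by this OS; skip it
    return applied
```

With this, a dead peer is noticed by the kernel and surfaces as an error on the next socket operation, without the application keeping its own clock.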

How does the timeout help? Expose the lack of progress to the user and give them a way to give up; if they choose to walk away, then you stop. The only timeout should be in the head of a human that can make real decisions about how long is too long. The real problem: me knowing that if the software had waited a bit longer, it would have worked. Your timeouts just cause more busy work and are often the root cause of snowball effects.
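A minimal Python sketch of that idea, with no deadline anywhere: the operation runs until it finishes, the elapsed time is surfaced through a callback, and the only way out early is a cancel flag that a human sets. The name `run_with_user_cancel` and its parameters are hypothetical. In this sketch cancelling only stops waiting; actually aborting the underlying work is left out.

```python
import threading
import time


def run_with_user_cancel(work, cancel, report, poll_s=0.05):
    """Run work() in a background thread with no deadline.

    report(elapsed_seconds) is called while the work is still running, and
    cancel (a threading.Event set by the user) is the only early exit.
    Returns ("done", value) or ("cancelled", None). If work() raises,
    this sketch returns ("done", None).
    """
    result = {}

    def target():
        result["value"] = work()

    worker = threading.Thread(target=target, daemon=True)
    started = time.monotonic()
    worker.start()
    while worker.is_alive():
        if cancel.is_set():
            return ("cancelled", None)  # the human decided it was too long
        report(time.monotonic() - started)  # show how long we have waited
        worker.join(poll_s)  # short poll so cancellation is noticed quickly
    return ("done", result.get("value"))
```

In a UI, `report` would update a "still waiting, 42 s so far" indicator and `cancel.set()` would be wired to a Cancel button. `poll_s` only controls how often the flag is checked; it is not a timeout on the work itself.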
My hypothetical pitch deck title slide: setTimeout() on a vector clock. I can hear Lamport’s scream from here and I live far away.