|
> is it my program’s responsibility to check what happens to the bytes AFTER the pipe? No, but it's not "after". Rather, it's your responsibility to handle backpressure by ensuring the bytes were written to the pipe successfully in the first place. This isn't just about the filesystem being full btw. If you imagine a command like ./foo.py | head -n 10, it only makes sense for the 'head' command to close the pipe when it's done, and foo.py should be able to detect this and stop printing any more output. (This is especially important if you consider that foo.py might produce infinite lines of output, like the 'yes' program.) I would argue this is not necessarily even an error from a user standpoint, so the return code from food.py should still be zero in many cases—a pipe-is-closed error just means the consumer simply didn't want the rest of the output, which is fine [1], whereas an out-of-disk-space error is probably really an error. Handling these robustly is actually difficult though, because (a) you'd need to figure out why printf() failed (so that you can treat different failures differently—but it's painful), and (b) you need to make sure any side effects in the program flow up to the printf() are semantically correct "prefixes" of the overall side effect, meaning that you'd need to pay careful attention to where you printf(). (Practically speaking, this makes it difficult to even have side effects that respect this, but that's an inherent problem/limitation of the pipeline model...) FWIW, I would be very curious if anyone has formalized all of these nuances of the pipeline model and come up with a robust & systematic way to handle them. It seems like a complicated problem to me. To give just one example of a problem that I'm thinking of: should stderr and stdout behave the same way with respect to "pipe is closed"? e.g. should the program terminate if either is closed, or if both are closed? The answer is probably "it depends", but on what exactly? What if they're redirected externally? What if they're redirected internally? Is there a pattern you can follow to get it right most of the time? There's a lot of room for analysis of the issues that can come up, especially when you throw buffering/threading/etc. into the mix... [1] Or maybe it isn't. Maybe the output (say, some archive format like ZIP) has a footer that needs to be read first, and it would be corrupt otherwise. Or maybe that's fine anyway, because the consumer should already understand you're outputting a ZIP, and it's on them if they want partial output. As always, "it depends". But I think a premature stdout closure is usually best treated as not-an-error. |
The usual way of handling this is by not (explicitly) handling it. Writes to a closed pipe are special, they do not normally fail with a status that the program then all too often ignores, they result in a SIGPIPE signal that defaults to killing the process. Extra steps are needed to not kill the process. No other kind of write error gets this special treatment that I am aware of.