by kazinator 1570 days ago
> There's our "No space" error getting reported by the OS, but no matter, the program silently swallows it and returns 0, the code for success. That's a bug!

Bzzt, no. You can't say that without knowing what the program's requirements are.

Blindly "fixing" a program to indicate failure due to not being able to write to standard output could break something.

Maybe the output is just a diagnostic that's not important, but some other program will react to the failed status, causing an issue.

Also, if a program produces output with a well-defined syntax, then the termination status may be superfluous; the truncation of the output can be detected by virtue of that syntax being incomplete.

E.g. JSON hello world fragment:

   puts("{\"hello\":\"world\"}");
   return 0;
If something is picking up the output and parsing it as JSON, it can deduce from a failed parse that the program didn't complete, rather than going by termination status.
4 comments

> if something is picking up the output and parsing it as JSON, it can deduce from a failed parse that the program didn't complete, rather than going by termination status.

This is bad advice. Consider output that might be truncated but can't be detected (mentioned in the article).

The exit status is the only reliable way to detect failures (unless you have a separate communication channel and send a final success message).
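
For reference, a minimal sketch of a hello world that does turn a failed write into a failed exit status, assuming plain C stdio. The function name `say_hello` is made up for illustration and is not from the article:

```c
#include <stdio.h>
#include <stdlib.h>

/* Writes the greeting to `out`; returns EXIT_SUCCESS only if the write
 * and the flush both succeeded. `say_hello` is a hypothetical name. */
int say_hello(FILE *out)
{
    if (fputs("Hello, world!\n", out) == EOF)
        return EXIT_FAILURE;

    /* With buffered output the data often reaches the OS only at flush
     * time, so an error like "No space" tends to surface here. */
    if (fflush(out) == EOF || ferror(out))
        return EXIT_FAILURE;

    return EXIT_SUCCESS;
}
```

A `main` that does `return say_hello(stdout);` would then report the failure when stdout points at a full device.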

My remark "if a program produces output with a well-defined syntax" was intended specifically to consider such cases, and set them aside.

I didn't communicate that clearly: syntax can be "well-defined" yet truncatable. I meant a syntax in which dropping any suffix (up to and including the entire message) produces either invalid output or an object of an unexpected type.

(In the case of JSON, valid JSON could be output which is truncatable, like 3.14 versus 3.14159. If the output is documented and expected to be a dictionary, we declare failure if a number emerges.)
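
A rough sketch of that consumer-side check, assuming the documented output is a single JSON object. This is deliberately not a JSON parser: it only looks at the opening brace and at bracket depth outside of strings, and `looks_like_complete_object` is a name invented here:

```c
#include <ctype.h>
#include <stdbool.h>

/* Returns true if `s` starts with '{' and its brackets balance back to
 * zero, with nothing but whitespace afterwards. Assumption: the producer
 * is documented to emit exactly one object. Not a full JSON validator. */
bool looks_like_complete_object(const char *s)
{
    int depth = 0;
    bool in_string = false, escaped = false, closed = false;

    while (isspace((unsigned char)*s))
        s++;
    if (*s != '{')
        return false;               /* e.g. a bare number such as 3.14 */

    for (; *s != '\0'; s++) {
        char c = *s;
        if (closed) {
            if (!isspace((unsigned char)c))
                return false;       /* junk after the closing brace */
            continue;
        }
        if (in_string) {
            if (escaped)
                escaped = false;
            else if (c == '\\')
                escaped = true;
            else if (c == '"')
                in_string = false;
            continue;
        }
        if (c == '"')
            in_string = true;
        else if (c == '{' || c == '[')
            depth++;
        else if ((c == '}' || c == ']') && --depth == 0)
            closed = true;
    }
    return closed;                  /* false if the text stopped early */
}
```

Under these assumptions a truncated `{"hello":"wor` and a bare `3.14` are both rejected, while the full object passes.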

When dealing with errors while integrating 100 different programs in a script, I don’t want to set aside special cases.

It should always behave the same. The exit code of a program is the agreed upon standard for this.
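
To make the "always the same" check concrete, here is a small sketch of a caller that treats every step identically, assuming a POSIX system where `system()` runs the command through a shell. `step_succeeded` is a hypothetical helper name:

```c
#include <stdlib.h>
#include <sys/wait.h>

/* Returns 1 only if `cmd` ran to completion and exited with status 0.
 * Death by signal or any non-zero exit counts as failure. */
int step_succeeded(const char *cmd)
{
    int status = system(cmd);
    if (status == -1)
        return 0;                   /* the shell could not be started */
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

The same one-line test then works for all 100 programs, with no per-program knowledge of their output syntax.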

> Also, if a program produces output with a well-defined syntax, then the termination status may be superfluous; the truncation of the output can be detected by virtue of that syntax being incomplete.

The author covers this (or rather, the possibility that truncation cannot be detected).

There is more nuance to this, which is that we cannot detect all modes of failure just because we have written to a file object, and successfully flushed and closed it.

In the case of file I/O, we do not know that the bits have actually gone to the storage device. A military-grade hello world has to perform an fsync. I think that also requires the right storage hardware to be entirely reliable.
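
A sketch of what the fsync variant might look like, assuming a POSIX system and output going to a regular file. `write_durably` is an invented name, and a successful `fsync` still only says what the OS believes about the device:

```c
#define _POSIX_C_SOURCE 200809L     /* for fileno() and fsync() */
#include <stdio.h>
#include <unistd.h>

/* Returns 0 only if the write, flush, fsync and close all succeed;
 * returns -1 otherwise. */
int write_durably(const char *path, const char *msg)
{
    FILE *f = fopen(path, "w");
    if (f == NULL)
        return -1;

    int ok = fputs(msg, f) != EOF
          && fflush(f) == 0          /* stdio buffer -> kernel */
          && fsync(fileno(f)) == 0;  /* kernel -> storage, per the OS */

    if (fclose(f) != 0)              /* close can report errors too */
        ok = 0;

    return ok ? 0 : -1;
}
```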

If stdout happens to be a TCP socket, then all we know from a successful flush and close is that the data has gone into the network stack, not that the other side has received it. We need an end-to-end application level ack. (Even just a two-way orderly shutdown: after writing hello, half-close the socket. Then read from it until EOF. If the read fails, the connection was broken and it cannot be assumed that the hello had been received.)
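
A sketch of that half-close handshake, assuming a connected stream socket and a peer whose protocol is to read until EOF and only then close its own side (without that assumption, seeing EOF proves nothing about receipt). `send_and_await_close` is a hypothetical name, and a real program would also want to ignore SIGPIPE:

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 0 if every byte was written and the peer then shut down its
 * side in an orderly way; returns -1 on any write, shutdown or read error. */
int send_and_await_close(int fd, const char *msg)
{
    size_t left = strlen(msg);

    while (left > 0) {
        ssize_t n = write(fd, msg, left);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        msg += n;
        left -= (size_t)n;
    }

    if (shutdown(fd, SHUT_WR) != 0)  /* half-close: done writing */
        return -1;

    char sink[256];
    for (;;) {
        ssize_t n = read(fd, sink, sizeof sink);
        if (n == 0)
            return 0;                /* EOF: orderly close from the peer */
        if (n < 0 && errno != EINTR)
            return -1;               /* e.g. connection reset */
    }
}
```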

This issue is just a facet of a more general problem: if the goal of the hello world program is to communicate its message to some destination, the only way to be sure is to obtain an acknowledgement from that destination: communication must be validated end-to-end, in other words. If you rely on any success signal of an intermediate agent, you don't have end-to-end validation of success.

The super-robust requirements for hello world therefore call for a protocol, something like this:

    char buffer[16];

    puts("Hello, world!");
    puts("message received OK? [y/n]");
    return (fgets(buffer, sizeof buffer, stdin) != NULL && buffer[0] == 'y')
            ? EXIT_SUCCESS : EXIT_FAILURE;
Now we can detect failures such as there being no user at the console to read the message, or their monitor not working so that they can't read the question.

We can now correctly detect this case of not being able to deliver hello, world, converting it to a failed status:

  $ ./hello < /dev/null > /dev/null
We can still be lied to, but there is strong justification in regarding that as not our problem:

  $ yes | ./hello > /dev/null
We cannot get away from requiring syntax, because the presence of a protocol gives rise to it; the destination has to be able to tell somehow when it has received all of the data, so it can acknowledge it.

A super reliable hello world also must not take data integrity for granted; the message should include some kind of checksum to reduce the likelihood of corrupt communication going undetected.
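
As a sketch of the checksum idea, here is one way it could be done with Fletcher-16. The algorithm choice and the line format (message, a tab, four hex digits, newline) are assumptions made for this example, as are the function names:

```c
#include <ctype.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Plain Fletcher-16 over `len` bytes. */
uint16_t fletcher16(const unsigned char *data, size_t len)
{
    uint16_t sum1 = 0, sum2 = 0;
    for (size_t i = 0; i < len; i++) {
        sum1 = (uint16_t)((sum1 + data[i]) % 255);
        sum2 = (uint16_t)((sum2 + sum1) % 255);
    }
    return (uint16_t)((sum2 << 8) | sum1);
}

/* Hypothetical framing: "<message>\t<4 hex digits>\n".
 * Returns 0 on success, -1 if `out` is too small. */
int frame_message(const char *msg, char *out, size_t outsz)
{
    unsigned sum = fletcher16((const unsigned char *)msg, strlen(msg));
    int n = snprintf(out, outsz, "%s\t%04x\n", msg, sum);
    return (n > 0 && (size_t)n < outsz) ? 0 : -1;
}

/* Returns 1 if `line` is a complete frame whose checksum matches, else 0. */
int frame_is_intact(const char *line)
{
    size_t len = strlen(line);
    const char *tab = strrchr(line, '\t');

    if (tab == NULL || len == 0 || line[len - 1] != '\n')
        return 0;                    /* truncated or not framed at all */
    if (strlen(tab + 1) != 5)
        return 0;                    /* expect 4 hex digits + newline */
    for (int i = 1; i <= 4; i++)
        if (!isxdigit((unsigned char)tab[i]))
            return 0;

    unsigned long got = strtoul(tab + 1, NULL, 16);
    return got == fletcher16((const unsigned char *)line,
                             (size_t)(tab - line));
}
```

The trailing newline doubles as the end-of-message marker, so the receiver can tell a short message from a truncated one before it bothers checking the sum.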

> You can't say that without knowing what the program's requirements are.

The "program's requirements" can in theory be "to be a buggy, unusable piece of shit". But when we speak, we don't need to consider that use case.