by mpweiher 3688 days ago
> especially with modern UB-aggressive optimizing compilers.

You put your finger on the problem: "modern UB-aggressive optimising compilers". C, the language, is actually quite simple (if not easy). The crazy stuff that compiler writers have been doing recently while aggressively mis-reading the C standard is the problem and does make things very complicated.

Why "misreading"?

From 1.1:

"The X3J11 charter clearly mandates the Committee to codify common existing practice."

Their emphasis, not mine. So is there a mandate to use the definitions of the standard to invalidate common existing practice? Clearly not. Yet that is what is happening.

More from the standard (defining UB):

"Undefined behavior gives the implementor license not to catch certain program errors that are difficult to diagnose. It also identifies areas of possible conforming language extension: the implementor may augment the language by providing a definition of the officially undefined behaviour."

Does it say "Undefined behaviour gives implementors license to add new optimisations that break existing programs"? Clearly and unambiguously not.

See http://port70.net/~nsz/c/c89/rationale/a.html#1


Your interpretation of "codify common existing practice" would imply that no new compiler optimizations could be implemented since 1990 (when the first version of the standard was published), as any optimization could potentially change the observable execution behavior of an erroneous program that contains UB.

> More from the standard (defining UB):

Your quote is not from the normative text of the standard, but from the non-normative rationale. Note however that it explicitly says that programs that contain undefined behaviors are erroneous, and that the implementation is not required to emit diagnostics for the UB. Pretty clearly this allows implementations to optimize erroneous programs into whatever they think is funny this week.

The normative text of the standard is pretty unambiguous:

    undefined behavior
    behavior, upon use of a nonportable or erroneous program construct or of erroneous data,
    for which this International Standard imposes no requirements
http://www.iso-9899.info/n1570.html#3.4.3

> Your interpretation of "codify common existing practice" would imply that no new compiler optimizations could be implemented since 1990

Utter nonsense. I use that word carefully, but in this case it is absolutely appropriate.

Compiler optimisations, per an old but very useful definition, aren't allowed to change the visible behaviour of programs (in terms of output; obviously they are allowed to change execution times).

For example, even just a couple of years ago the compilers I used would execute a loop that sums the first n integers. Nowadays compilers detect this and replace the loop with the result. While this isn't particularly useful, because probably the only reason you're summing the first n integers in a loop is to do some measurements, it is (a) a perfectly legal optimisation and (b) happened after 1990.
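To make the loop example concrete, here is a minimal sketch of the two forms. The function names are mine, purely for illustration, and unsigned arithmetic is assumed so that no UB is involved at all. An optimiser that turns the first function into something like the second changes execution time but not the output, which is the sense of "legal optimisation" meant here.

```c
#include <stdint.h>

/* The loop as a programmer would write it: sums the integers 1..n.
   The counter is 64 bits wide so the loop terminates even for n == UINT32_MAX. */
uint64_t sum_loop(uint32_t n) {
    uint64_t total = 0;
    for (uint64_t i = 1; i <= n; i++) {
        total += i;
    }
    return total;
}

/* The closed form n * (n + 1) / 2 that an optimiser may substitute.
   With n < 2^32 the product fits in 64 bits, so both functions
   return the same value for every input. */
uint64_t sum_closed_form(uint32_t n) {
    uint64_t m = n;
    return m * (m + 1) / 2;
}
```

Whether a particular compiler actually performs this rewrite depends on the compiler, version and flags; the point is only that the two functions are indistinguishable by their results.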

Unsurprisingly, you left out the second part of the (later) definition:

    NOTE Possible undefined behavior ranges from ignoring the situation completely with unpredictable
    results, to behaving during translation or program execution in a documented manner characteristic of the
    environment (with or without the issuance of a diagnostic message), to terminating a translation or
    execution (with the issuance of a diagnostic message).
Notably absent is "use the undefined behaviour to shave another 0.2% off my favourite benchmark".

> Unsurprisingly, you left out the second part of the (later) definition:

It is not part of the normative definition, which says "for which this International Standard imposes no requirements". In ISO standards, notes are without exception non-normative.

Although I think they really should add your proposed text as an additional example, as their current set of examples is evidently confusingly incomplete :-)

> Note however that it explicitly says that programs that contain undefined behaviors are erroneous

No, it doesn't say that. It says that they are either "nonportable" or "erroneous". I'll take "nonportable" for 400, please.

As the "rationale" document points out, implementations are free to do something well-defined in the cases that the standard considers UB. For example, an implementation may document that it detects out-of-bounds array reads and these always return the value "0", and a hypothetical "C" program could rely on that. But implementations explicitly aren't required to do that, hence code that relies on a particular interpretation of UB in a particular implementation is nonportable, since it is a program written in an extended dialect of C, not ISO standard C.
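As a rough sketch of what that hypothetical guarantee amounts to, here is the same behaviour spelled out in portable C. The helper `read_or_zero` is invented for illustration and is not part of any real implementation; a program written for the hypothetical dialect would simply write `array[index]` and rely on the implementation's documentation, while a portable program has to carry the check itself.

```c
#include <stddef.h>

/* Hypothetical helper: mimics an implementation that documents
   out-of-bounds array reads as always yielding 0.
   In ISO C the caller must pass the length explicitly, because the
   language itself attaches no bounds to a pointer. */
int read_or_zero(const int *array, size_t length, size_t index) {
    if (index >= length) {
        return 0;            /* the "documented" out-of-bounds result */
    }
    return array[index];     /* in bounds: an ordinary, well-defined read */
}
```

Code that indexes past the end without such a check is only meaningful on the implementation that documents the extension, which is exactly what makes it nonportable.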

Options like GCC's -fwrapv/-ftrapv and -fno-strict-aliasing are examples of language extensions that are essentially implementation-defined UB.
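A small sketch of what -fwrapv changes, assuming a typical two's-complement target; the function names are made up for illustration. In ISO C the first function has UB when `x == INT_MAX`, so a compiler is allowed to assume the comparison is always true. Under -fwrapv the addition is defined to wrap, and the comparison is false for `INT_MAX`. The second function expresses the same question without relying on either reading.

```c
#include <limits.h>
#include <stdbool.h>

/* Nonportable: x + 1 overflows when x == INT_MAX, which is UB in ISO C.
   The result for INT_MAX is only defined in a dialect such as the one
   selected by GCC's -fwrapv. */
bool next_is_larger_nonportable(int x) {
    return x + 1 > x;
}

/* Portable: asks whether x + 1 is representable without ever computing it. */
bool next_is_larger_portable(int x) {
    return x < INT_MAX;
}
```

For every input other than `INT_MAX` the two functions agree, which is why the difference tends to go unnoticed until an optimiser exploits it.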

Edit: Of course you could argue that cases where hardware differences are the likely motivation, such as signed integer overflow, ought not to be UB in the first place and should instead be left implementation-defined in the standard. But in that case your issue is with the C standard committee, not with implementers.