Yes, random blog posts did a lot of damage here. Also broken compilers [1]. Note that blog post is correct about C++ but incorrectly assumes this is true for C as well.
I'm inclined to trust Raymond Chen and John Regehr on these matters, so if you assert that they're incorrect here then a source to back up your assertion would help your argument.
I am a member of WG14. You should check the C standard. I do not see how "time-travel" is a possible reading of the definition of UB in C. We added another footnote to C23 to counter this idea:
https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3220.pdf
"Any other behavior during execution of a program is only affected as a direct consequence of the concrete behavior that occurs when encountering the erroneous or non portable program construct or data. In particular, all observable behavior (5.1.2.4) appears as specified in this document when it happens before an operation with undefined behavior in the execution of the program."
I should point out that compilers also generally do not do true time-travel: Consider this example: https://godbolt.org/z/rPG14rrbj
So maybe we have different definitions of "time travel". But I recall that
- if a compiler finds that condition A would lead to UB, it can assume that A is never true
- that fact can "backpropagate" to, for example, eliminate comparisons long before the UB.
There may be different definitions, but also a lot of incorrect information. Nothing changes with C23 except that we added a note that clarifies that UB can not time-travel. The semantic model in C only requires that observable effects are preserved. Everything else can be changed by the optimizer as long as it does not change those observable effects (known as the "as if" principle). This is generally the basis of most optimizations. Thus, I call time-travel only when it would affect previous observable effects, and this what is allowed for UB in C++ but not in C. Earlier non-observable effects can be changed in any case and is nothing speicifc to UB. So if you call time-travel also certain optimization that do not affect earlier observable behavior, then this was and is still allowed. But the often repeated statement that a compiler can assume that "A is never true" does not follow (or only in very limited sense) from the definition of UB in ISO C (and never did), so one has to be more careful here. In particular it is not possible to remove I/O before UB. The following code has to print 0 when called with zero and a compiler which would remove the I/O would not be conforming.
int foo(int x)
{
printf("%d\n", x);
fflush(stdout);
return 1 / x;
}
In the following example
int foo(int x)
{
if (x) bar(x);
return 1 / x;
}
the compiler could indeed remove the "if" but not because it were allowed to assume that x can never be zero, but because 1 / 0 can have arbitrary behavior, so could also call "bar()" and then it is called for zero and non-zero x and the if condition could be removed (not that compilers would do this)
I think the clarification is good, probably the amount of optimizations that are prevented by treating volatile and atomics as UB barriers is limited, but as your example show, a lot of very surprising transformations are still allowed.
Unfortunately I don't think there is a good fix for that.
There is unconditional use of a pointer b, which is UB if b is null. However, there is an earlier branch that checks if b is null. If we expected the UB to "backpropagate", the compiler would eliminate that branch, but both gcc and clang at O3 keep the branch.
However, both gcc and clang have rearranged the side effects of that branch to become visible at the end of the function. I.e. if b is null, it's as if that initial branch never ran. You could observe the difference if you trapped SIGSEGV. So even though the compiler didn't attempt to "time-travel" the UB, in combination with other allowed optimizations (reordering memory accesses), it ended up with the same effect.
While interactions with volatile and interactive streams cannot time-travel, anything else is free to - the standard only imposes requirements on a conforming implementation in terms of the contents of files at program termination, and programs with undefined behaviour are not required to terminate, so there are approximately no requirements on a program that invokes undefined behaviour.
The issue you linked to is not a counter example because, as the poster said, g may terminate the program in which case that snippet does not have undefined behaviour even if b is zero. The fact that they bothered to mention that g may terminate the program seems like an acknowledgement that it would be valid to do that time travelling if it didn't.
> Note that blog post is correct about C++ but incorrectly assumes this is true for C as well.
Presumably you're referring to this line of the C++ standard, which does not appear in the C standard:
> However, if any such execution contains an undefined operation, this International Standard places no requirement on the implementation executing that program with that input (not even with regard to operations preceding the first undefined operation).
I looked at every instance of the word "undefined" in the C standard and, granted, it definitely didn't have anything quite so clear about time travel as that. But it also didn't make any counter claims that operations before are valid. It pretty much just said that undefined behaviour causes behaviour that is undefined! So, without strong evidence, it seem presumptuous to assume that operations provably before undefined behaviour are well defined.
The poster is me. You are right that this is not an example for time-travel. There aren't really good examples for true time travel because compilers generally do not do this. But my point is that with compilers behaving like this, people might confuse this for time-traveling UB. I have certainly met some who did and the blog posts seems to have similar examples (but I haven't looked closely now).
I didn't notice that section when I last read through C23, but I'm very glad to see it. Reining in UB is one of the hardest problems I've had to deal with, and being able to say operations are defined up to the point of UB makes my job so much easier.
The lack of clarity in earlier standards made it impossible to deal with code incrementally, since all the unknown execution paths could potentially breach back in time and smash your semantics.
Ok, fair enough. I must admit I was looking at C99 as I thought that was most generally relevant, I don't follow recent C standards (as much as I do those for C++) and C23 hasn't been ratified yet. I've found your new snippet:
> In particular, all observable behavior (5.1.2.4) appears as specified in this document when it happens before an
operation with undefined behavior in the execution of the program.
I consider that a change in the standard but, of course, that's allowed, especially as it's backwards compatible for well defined programs.
The wording is a little odd: it makes it sound a like you need some undefined behaviour in order to make the operations beforehand work, and, taken very literally, that operations between two undefined behaviours will work (because they're still "before an operation with undefined behavior"). But I suppose the intention is clear.
The definition of UB (which hasn't changed) is: "behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this document imposes no requirement."
Note that the "for which" IMHO already makes this clear that this can not travel in time. When everything could be affected these words ("for which") would be meaningless.