by ant6n 718 days ago
> but that code is reordered or transformed in complicated ways is true even without UB.

Without undefined behavior, the compiler emits code that has the behavior defined by the code: the ordering may be altered, but not the behavior.

Yes, and with undefined behavior, the compiler has to emit code that has the behavior defined by the code up to the operation that has undefined behavior.
That is false. If a compiler determines that some statement has undefined behavior, it can treat it as unreachable, and, transitively, other code before it as unreachable.

  printf("hello\n"); // this doesn't have to print
  x = x / 0;         // because this is effectively a notreached() assertion
This is in direct contradiction to what uecker says. Can you back up your claim -- for both C and C++? Putting your code in godbolt with -O3 did not remove the print statement for me in either C or C++. But I didn't experiment with different compilers or compiler flags, or more complicated program constructions.

https://godbolt.org/z/8nbbd3jPW

I've often said that I've never noticed any surprising consequences from UB personally. I know I'm on thin ice here and running the risk of looking very ignorant. There are a lot of blog posts and comments that spread what seems like FUD from my limited personal vantage point. It just seems hard to come across measurable evidence of actual miscompilations happening in the wild that show crazy unpredictable behaviour -- I would really like to have some of it to even be able to start tallying the practical impact.

And disregarding whatever formulations there are in the standard -- I think we can all agree that, insofar as compilers don't already do this, they should be fixed to reject programs with an error message when they can prove UB statically -- instead of silently producing something else or acting as if the code didn't exist.

Is there an error in my logic -- is there a reason why this shouldn't be practically possible for compilers to do, just based on how UB is defined? With all the flaws that C has, UB seems like a relatively minor one to me in practice.

Another example: https://godbolt.org/z/b5j99enTn

This is an adaptation of the Raymond Chen post, and it seems to actually compile to a "return 1" when compiling as C++ (not as C), at least with the settings I tried. And even the "return 1" is understandable to me, given that we actually hit a bug and there are no observable side effects before the UB happens. (But again, the compiler should be friendly enough to emit a diagnostic about what it's doing here, or, better, return an error.)

Un-comment the printf statement and you'll see that the code totally changes. The printf actually happens now. So again, what uecker says about observable effects seems to apply.

In this [1] example GCC hoists, even in C mode, a potentially trapping division above a volatile store. If c=0 you get one less side effect than expected before UB (i.e. the division by zero trap). This is arguably a GCC bug if we agree on the new standard interpretation, but it does show that compilers do some unsafe time travelling transformations.

Hoisting the loop invariant div is an important optimization, but in this case I think the compiler could preserve both the optimization and the ordering of the side effects by loop-peeling.

[1] https://godbolt.org/z/ecsdrPa94
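To make the loop-peeling idea concrete, here is a minimal sketch. It is not the code behind the godbolt link; the names `sink`, `sum_naive` and `sum_peeled` are made up for illustration. The peeled version runs the first iteration separately, so the first volatile store still happens before the first division, and the quotient is then reused for the remaining iterations.

```c
/* Hypothetical stand-in for the volatile object in the example. */
volatile int sink;

/* Source-order version: every iteration stores, then divides. */
int sum_naive(int n, int a, int c) {
    int sum = 0;
    for (int i = 0; i < n; i++) {
        sink = i;        /* volatile store: an observable side effect */
        sum += a / c;    /* loop-invariant, traps on typical targets if c == 0 */
    }
    return sum;
}

/* Hand-peeled version: the division is computed only once, but not
 * before the first volatile store, and not at all when n <= 0. */
int sum_peeled(int n, int a, int c) {
    int sum = 0;
    if (n > 0) {
        sink = 0;            /* first iteration's store comes first */
        int q = a / c;       /* the single division, after that store */
        sum += q;
        for (int i = 1; i < n; i++) {
            sink = i;
            sum += q;        /* reuse the hoisted quotient */
        }
    }
    return sum;
}
```

Both functions return the same result for valid inputs; the difference is only where a trap would land relative to the first store if c were 0.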

Thanks for the example. But again I can't see a problem. The compiler does not actually prove UB in this case, so I suppose this doesn't qualify as applying (mis-) optimizations silently based on UB. Or what did I miss?
Compilers don't prove UB; they assume absence of UB.

That, plus a modicum of reasoning like "if this were to be evaluated, it would be UB" (therefore, let's assume that is not evaluated).
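A small sketch of that kind of reasoning, using a hypothetical function `checked_after` that is not from any of the linked examples. Because the division comes first, a compiler is allowed to conclude that `a` is nonzero afterwards and drop the later check; whether a given compiler and optimization level actually does so is something to verify in godbolt rather than take from this comment.

```c
/* Hypothetical example: the zero check comes too late to help. */
int checked_after(int y, int a) {
    int q = y / a;      /* undefined behavior if a == 0 */
    if (a == 0) {       /* a compiler may treat this branch as dead:   */
        return -1;      /* if a were 0, the line above was already UB  */
    }
    return q;
}
```

For any nonzero `a` the function is fully defined and simply returns the quotient.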

The compiler is moving a potentially UB operation above a side effect. This contradicts uecker's claim that UB does not time-travel, and it is potentially a GCC bug.

If you want an example of GCC removing a side effect that happens-before provable subsequent UB: https://godbolt.org/z/PfoT8E8PP but I don't find it terribly interesting as the compiler warns here.

The implementation can assume that the program does not perpetrate undefined behavior (other than undefined behavior which the implementation itself defines as a documented extension).

The only way the program can avoid perpetrating undefined behavior in the statement "x = x / 0" is if it does not execute that statement.

Thus, to assume that the program does not invoke undefined behavior is tantamount to assuming that the program does not execute "x = x / 0".

But "x = x / 0" follows printf("hello\n") unconditionally. If the printf is executed, then x = x / 0 will be executed. Therefore if the program does not invoke undefined behavior, it does not execute printf("hello\n") either.

If the program can be assumed not to execute printf("hello\n"), there is no need to generate code for it.

Look at the documentation for GCC's __builtin_unreachable:

> Built-in Function: void __builtin_unreachable (void)

> If control flow reaches the point of the __builtin_unreachable, the program is undefined. It is useful in situations where the compiler cannot deduce the unreachability of the code.

The unreachable code assertion works by invoking undefined behavior!
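For reference, a typical use looks roughly like this. `level_weight` is a made-up helper; `__builtin_unreachable` is the GCC builtin quoted above (Clang supports it too). The caller promises that `level` is 0, 1 or 2, and the default branch tells the compiler it need not generate code for anything else.

```c
/* Hypothetical helper: callers promise that level is 0, 1 or 2. */
int level_weight(int level) {
    switch (level) {
    case 0: return 1;
    case 1: return 10;
    case 2: return 100;
    default:
        /* Reaching this point is undefined behavior, so the compiler
         * may assume it never happens. */
        __builtin_unreachable();
    }
}
```

Calling it with any other value is undefined, which is exactly the same contract as "x = x / 0" being reached.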

x/0 is not reached if the printf blocks forever, exits, or returns via an exceptional path (longjmp in C, exceptions in C++). Specifically, the standard printf won't longjmp or exit (though glibc's can), but it can still block forever, so in practice the compiler can't hoist UB over opaque function calls.

edit: this is in addition to the guarantees with regard to side effects that uecker says the C standard provides.
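A sketch of why an opaque call gets in the way, with hypothetical names (`hook_fn`, `exit_on_zero`, `guarded_div`). Since the hook may never return, the division after it is only reached on some executions, and a compiler that cannot see into the hook has to keep the call in place before the division.

```c
#include <stdlib.h>

/* Hypothetical callback type standing in for an opaque function. */
typedef void (*hook_fn)(int divisor);

/* One possible hook: terminates the program instead of returning. */
void exit_on_zero(int divisor) {
    if (divisor == 0) {
        exit(EXIT_FAILURE);
    }
}

int guarded_div(int y, int a, hook_fn hook) {
    hook(a);        /* may block, exit or longjmp; unknown to the compiler */
    return y / a;   /* only reached if hook() actually returned */
}
```

With `exit_on_zero` as the hook, `guarded_div(y, 0, exit_on_zero)` never executes the division at all, so there is no UB to reason backwards from.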

But does `printf();` return to the caller unconditionally?

This is far from obvious -- especially once SIGPIPE comes into play, it's quite possible that printf will terminate the program and prevent the undefined behavior from occurring. Which means the compiler is not allowed to optimize it out.

`for(;;);` does not terminate; yet it can be removed if it precedes an unreachability assertion.

The only issue is that writing to a stream is visible behavior. I believe that it would still be okay to eliminate visible behavior if the program asserts that it's unreachable. The only reason you might not be able to coax the elimination out of compilers is that they are being careful around visible behavior. (Or, more weakly, around external function calls).

Yeah, but do you have an actual instance of "time travel" happening? Without one, the issue is merely a theoretical discussion of how to understand or implement the standards. If you provide a real instance, the practical impact and possible remedies could be discussed.
Mmmh, how about

    #include <stdio.h>


    int f(int y, int a) {
        int x, z;
        printf("hello ");
        x = y / a;
        printf("world!");
        z = y / a;
        return x+y;
    }
In godbolt, it seems the compiler tends to combine the two printfs together. So if a=0, the UB sits between the printfs in the source, but it won't happen until after both printfs have run. Here the UB is delayed. But will the compiler actually make sure that, in some other case, the y / a won't be moved earlier somehow? Does the compiler take any potentially undefined behavior and force ordering constraints around it? ...The whole point of UB is to be able to optimize the code as if it doesn't have undefined behavior, so that we all get the maximum optimization and correct behavior as long as there's no UB in the code.