Hacker News new | ask | show | jobs
by shakna 31 days ago
"volatile" tells the compiler it is _not_ safe to optimise away any read or write, so it can't just optimise that section away at all.

> An object that has volatile-qualified type may be modified in ways unknown to the implementation or have other unknown side effects. Therefore any expression referring to such an object shall be evaluated strictly according to the rules of the abstract machine, as described in 5.1.2.3. Furthermore, at every sequence point the value last stored in the object shall agree with that prescribed by the abstract machine, except as modified by the unknown factors mentioned previously.

A compliant compiler is only free to optimise away, where it can determine there are no side-effects. But volatile in 5.1.2.3 has:

> Accessing a volatile object, modifying an object, modifying a file, or calling a function that does any of those operations are all side effects.

3 comments

Yes, but undefined behaviour is undefined behaviour, and that behaviour can legally be that the code is not emitted at all, volatile (or any other side effect) or not. (and compilers do reason about undefined behaviour when optimising, so this isn't necessarily a completely theoretical argument, though I don't know whether the in compiler's actual logic which of 'don't optimise volatile' or the 'do assume undefined behaviour is impossible and remove code that definitely invokes it' would 'win', or whether there's any current compiler that would flag this as unconditionally undefined behaviour in the first place).
Volatile wins.

GCC calls that out [0] - volatile means things in memory may not be what they appear to be, and that there are asynchronous things happening, so something that may not appear to be possible, may become so, because volatile is a side-effect.

So about the only optimisation allowed to happen, is combining multiple references.

Clang is similar:

> The compiler does not optimize out any accesses to variables declared volatile. The number of volatile reads and writes will be exactly as they appear in the C/C++ code, no more and no less and in the same order.

[0] https://www.gnu.org/software/c-intro-and-ref/manual/html_nod...

That's cool and all if you are writing GCC or Clang dialect C, but it doesn't change the fact that it is UB in the C standard.
This is all assuming that the code is not invoking undefined behaviour. If the code is invoking undefined behaviour, GCC and clang are both well within their rights to say 'none of the rest of our documentation applies' (and have historically done so on bug reports).
Sure it can. That code path has unconditional UB and thus it is not valid.
Only if there would be no side-effects. Which there are.
No this is irrelevant for making this decision
I've mentioned elsewhere the standards, and compilers as well, disagreeing with you here.

But feel free to run against the various compilers through godbolt. [0] They won't optimise the branch away. Access to a volatile, must be preserved, in the order that they exist. No optimisation, UB or otherwise, is allowed to impede that. Because an access is a side-effect.

[0] https://godbolt.org/z/85cGhq3Ta

Compilers not doing something is not a demonstration that they are not actually allowed to do that thing.
That they won’t is as most a courtesy to you but they are not required to do this.
> Furthermore, at every sequence point the value last stored in the object shall agree with that prescribed by the abstract machine, except as modified by the unknown factors mentioned previously.

I quoted the C standard, first. Not compiler behaviour.

I showed where it requires the compiler not to optimise this.

How about, instead of one-line throwaway disagreements, you point out where they are permitted to do this, instead?

This looks like a long back and fourth, that can easily be solved by a minute or two on godbolt...
> that can easily be solved by a minute or two on godbolt...

Unfortunately it's not that simple when it comes to UB. If the snippet in question does in fact exhibit UB then there's no guarantee whatever Godbolt shows will generalize to other programs/versions/compilers/environments/etc.

That's very funny to me.

A) x is always removed.

B) no, it's never removed if volatile.

But neither person can prove what a compiler will actually do, despite claiming they'll always act a certain way given 5 lines of code.

Also, at behavioural edges what you'll see on Godbolt is compiler bugs. So you learn nothing about what should happen.

All popular modern C++ compilers have known bugs and while I'm sure there are C compilers with no known bugs that will be because nobody tested very hard.

I have watched a compiler flip between emitting the code I expected (despite it having UB), and emitting unexpected code after a minor update.

What you observe a compiler do when there's UB is not at all something you can rely on.

No, claim A is 'x may be removed by a conforming C compiler'. Whether any given version of a given compiler actually does so in any given circumstance is a different question (the answer being: probably not, because while this is undefined behaviour it's not likely something that is going to be flagged as such by a compiler's optimizer. Also, from some testing with GCC and forcing a null point dereference, it seems like volatile at least does win in that case with the current version of it x86, and it dutifully emits the null pointer dereference and then the 'ud2' instruction instead of the rest of that execution path).
I made the weaker claim that x can be removed. This is something I could prove with compiler output but I would have to find a compiler willing to make this optimization which is not something I can guarantee.
No, compilers will often choose to not optimize on UB.
When compiler decides something is UB aka "result of this code is not defined and could be any" it selects the most performant version of undefined behavior - doing nothing by optimizing code away.
The compiler is not free to remove accesses to something marked volatile - its defined as a side-effect.

Volatile means something else may be acting here. Something else may install anything into the register at any time - and every time you access.

The compiler is required to preserve the order of accesses. In almost every C compiler, today, there are almost no optimisations the moment a volatile is introduced, for this reason.

If code has undefined behavior, the entire execution path that leads to that UB has no assigned semantics in the C model. So there are no volatile accesses in this code according to the C abstract machine - the entire execution path is UB, so it can be assumed it doesn't happen at all.
> An object that has volatile-qualified type may be modified in ways unknown to the implementation or have other unknown side effects. Therefore any expression referring to such an object shall be evaluated strictly according to the rules of the abstract machine

The execution path has unknown side effects, and so the execution path must be strictly followed. That's uh... The entire point of that section in the C standard. Its why volatile is called out, in the semantic model for the abstract machine.

Otherwise... Why call it out, at all? It must be strictly followed, not lazily, as in other areas of the standard.

Previously discussed here: https://news.ycombinator.com/item?id=33770277

UB supersedes volatile, once the compiler hits UB then all bets are off. Compilers can and do optimize out UB branches, which is almost never what you want... yet here we are.

From that thread: https://news.ycombinator.com/item?id=33770905

>> The moment you enter a compilation unit (assuming no link optimizations) with a state which at some point will run into undefined behavior all bets are of. [...] Yes, UB can "time travel"

> Close, but not quite. This is a common misconception in the reverse direction.

> Abstractly, what UB can do is performing the inverse of the preceding instructions, effectively making the abstract machine run in reverse. However, this is only equivalent to "time-traveling" until you get to the point of the last side effect (where "side effect" here refers to predefined operations in the standard that interact with the external world, such as I/O and volatile accesses), because only everything since that point can be optimized away under the as-if rule without altering the externally visible effects of the program.

> As a concrete, practical example, this means the following: if you do fflush(stdout); return INT_MAX + 1; the compiler cannot omit the fflush() call merely because the subsequent statement had undefined behavior. That is, the UB cannot time-travel to before the flush. What the program can do is to write garbage to the file afterward, or attempt to overwrite what you wrote in the file to revert it to its previous state, but the fflush() must still occur before anything wild happens. If nobody observes the in-between state, then the end result can look like time-travel, but if the system blocks on fflush() and the user terminates the program while it's blocked, there is no opportunity for UB.

The print example has no defined order of accesses, function parameters can be evaluated in any order. But further, the entire problem with UB is that it supercedes the regular guarantees that you get (like with volatile) when it's encountered. Yes gcc and clang do the obvious thing that makes the most sense in this example, but what people are trying to tell you is that they could just not do that and they would still be complying with the standard. For example, you can imagine a more serious example of UB that causes the program to fail to compile completely, and then do you emit the correct number of in order reads of volatile variables? Obviously not.
Function parameters cannot be evaluated in any order, when one of them is a volatile.

> The initialization shall occur in initializer list order, each initializer provided for a particular subobject overriding any previously listed initializer for the same subobject

And what I am trying to tell people, is the standard has expectations around the volatile keyword, that the compilers took into account when designing how they would work - it isn't just kindness, its compliance. But no one is actually talking about the quotes from the standard, and just quoting themselves and their own understandings.

That quote doesn't have anything to do with parameter evaluation order. There is no order for function parameter evaluation.

And no, there is no exception for undefined behavior. There can't be, otherwise the behavior would be... defined. It's in the name. Again, what do you think the compiler emits when the undefined behavior causes the program to not compile altogether?