Did you read TFA? The author did that (though using GCC), and the reason the optimizer does what you see is undefined behavior due to signed integer overflow.
Did you understand the comment? The author used GCC, and GCC was only able to vectorize the loop. Clang, on the other hand, essentially turned this O(n) algorithm for calculating a particular sum into an O(1) result.
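To make the O(n) versus O(1) point concrete, here is a minimal sketch of the kind of rewrite being described. It is not the code from TFA: the function names are hypothetical, the sum is assumed to be 0 + 1 + ... + (n - 1), and unsigned arithmetic is used so that wraparound is well defined and the two versions can be compared safely.

```c
#include <assert.h>
#include <stdint.h>

/* O(n): the straightforward loop. uint32_t arithmetic wraps modulo 2^32. */
uint32_t sum_loop(uint32_t n) {
    uint32_t total = 0;
    for (uint32_t i = 0; i < n; i++) {
        total += i;
    }
    return total;
}

/* O(1): the closed form n*(n-1)/2, computed in 64 bits so the product
 * cannot overflow, then truncated to 32 bits to match the wrapped sum. */
uint32_t sum_closed_form(uint32_t n) {
    if (n == 0) {
        return 0;
    }
    uint64_t wide = (uint64_t)n * (uint64_t)(n - 1) / 2;
    return (uint32_t)wide;
}

/* Hypothetical helper: spot-check that both versions agree. */
void check_closed_form(void) {
    for (uint32_t n = 0; n < 2000; n++) {
        assert(sum_loop(n) == sum_closed_form(n));
    }
}
```

The two functions agree even when the mathematical sum exceeds 32 bits, which is the sense in which a closed form can still be "the right answer" under wrapping arithmetic.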
> the reason the optimizer does what you see is undefined behavior due to signed integer overflow
Yes, undefined behavior gives the optimizer the right in this case to transform the code into anything, including a nonsense answer or a trap instruction. But the optimizer did not; it produced the right answer under two's complement arithmetic.
There is no undefined behaviour; '#pragma GCC optimize("wrapv")' takes care of that.
EDIT: It seems that clang doesn't support #pragma GCC optimize, so it's a no-op in that snippet. It doesn't change the result though. If you pass the -fwrapv flag to clang, the code is optimized in exactly the same way.
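For anyone unsure what wrapping semantics mean at the source level, here is a rough sketch that gets the same result for a single addition without relying on any flag. The function name is hypothetical. It assumes that converting an out-of-range unsigned value back to int reduces modulo 2^N, which is implementation-defined in C but is what GCC and clang document.

```c
#include <limits.h>

/* Add two ints with two's complement wraparound instead of UB.
 * The addition is done on unsigned values, where overflow is defined
 * to wrap. The cast back to int is implementation-defined (assumed
 * here to be modular, as on GCC and clang). */
int wrapping_add(int a, int b) {
    return (int)((unsigned int)a + (unsigned int)b);
}
```

Under that assumption, wrapping_add(INT_MAX, 1) yields INT_MIN, whereas plain INT_MAX + 1 on int is undefined unless wrapping is enabled.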
Just to be clear, undefined behavior means the standard allows implementations to do what they feel is the right thing to do under that scenario, and the outcome will still comply with the standard.
Undefined behaviour, in reality, means: the compiler will assume that it doesn’t happen, so whatever code path leading up to it can also (by definition) not happen and be eliminated. E.g. signed integer overflow “cannot happen” so you never need to emit code checking for it or dealing with it.
That’s the real world implication of undefined behaviour.
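A small sketch of the "assume it cannot happen" reasoning, using hypothetical function names. Because signed overflow is undefined, a compiler is allowed to treat the first comparison as always true and drop the arithmetic entirely; the second version states the edge case explicitly so there is no overflow to reason about.

```c
#include <limits.h>

/* x + 1 overflows only when x == INT_MAX, which is UB. A compiler may
 * assume that never happens and fold this function to `return 1`. */
int increment_is_greater(int x) {
    return x + 1 > x;
}

/* Same question without any overflow: the INT_MAX case is handled
 * explicitly, so the result is defined for every input. */
int increment_is_greater_checked(int x) {
    return x != INT_MAX;
}
```

Calling the first function with INT_MAX is exactly the case the compiler is entitled to ignore, so only the checked version gives a dependable answer there.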
Undefined behavior is "literally anything can happen". So yes, implementations doing "what they feel is the right thing to do" is one possible result of UB (as in this case). It could also emit an rm -rf / call and it would still comply with the standard...