Hacker News new | ask | show | jobs
by kazinator 3701 days ago
The result of the addition isn't order dependent; whether or not there is overflow is order-dependent.

I put that into my original comment for the sake of completeness, in an additional edit after the floating-point remark. Ironically, it was to fend off smart-alecky follow-ups about how I forgot integer overflow.

A higher-level language should have a plus operator/function that traps integer overflows if it doesn't provide bignums.

If a sum function is defined as a simple left fold over the sequence using that plus, then part of the semantics of that sum is to throw the exception as soon as adding the next element from the sequence to the left accumulator overflows.

If an order-permuting operation is removed from the expression, it can introduce an overflow where there was none. Perhaps the programmer had that apparently useless order-permuting operation precisely to prevent overflow.

1 comments

I was going to ask, once again, how that is possible, but I think I figured it out while reading this, though you don't explicitly state it.

It can happen if some of the numbers are negative. For a simple example, if overflow occurs at 100, 50+60-20 overflows, but 50-20+60 does not.

It's very hard to be aware of the numerous details you find obvious yourself when you're communicating.

Also, by the way, I have an intuition that the issue might not specifically prevent the removal of a reverse operation. That is to say, if a series with some positives or negative integers does/doesn't overflow under a left fold, I have a hunch that it does/doesn't under a right fold.

If this can be proven, then that specific case is fine over integers (optimizing away reverse).

[1, intmax, -1]
So, yes, one direction overflows and one doesn't, but as long as the contract says overflow works as two's compliment, the answer is the same. This is why not having undefined behavior is so useful.
It doesn't have to be two's compliment - for any modular arithmetic, the answer will be the same.

But that's at a cost of giving up knowledge of whether it overflowed. In this case it overflows and underflows, when it does, an equal number of times, so there happens to be no problem. In some cases, all you care about is the answer modulo some number which jives well with your numbers' representation. In that case, there is also no problem.

In many cases, if there is a different number of overflows than underflows, you'd get the wrong answer without knowing it, and your answer is likely (but not guaranteed) to be very wrong when you do.

There are other approaches we can take. Overflow could saturate. In that case, we get the wrong answer in one direction here but it's only off by a little instead of a lot. Overflow could also trap, in which case we'd notice that we had risk of giving the wrong answer.

In either of these cases, we could not apply this optimization unless we allow for some measure of undefined behavior - which is why undefined behavior is so useful. It is absolutely the case that undefined behavior in a specification has significant drawbacks as well - but the issue at hand here is not the result of them.