The spec doesn’t prevent auto-vectorization; it only says the language should avoid it when it wants to opt in to producing “reproducible floating-point results” (section 11 of IEEE 754-2019). Vectorizing can be implemented in different ways, so whether a language avoids vectorizing in order to opt in to reproducible results is implementation dependent. It also depends on whether there is an option to not vectorize. If a language only had auto-vectorization, the vectorization result was deterministic and reproducible, and the language offered no serial mode, this could adhere to the IEEE spec. But since C++ (for example) offers serial reductions in debug & non-optimized code, and it wants to offer reproducible results, it has to be careful about vectorizing without the user’s explicit consent.
If you write a loop `for x in array { sum += x }`, then your program is a specification that you want to add the elements in exactly that order, one by one. Vectorization would change the order.
The bigger problem there is the language not offering a way to signal the author’s intent. If an author doesn’t care about the order of operations in a sum, they will still write the exact same code as the author who does care. This is a failure of the language to be expressive enough, and doesn’t reflect on the IEEE spec. (The spec even does suggest that languages should offer and define these sorts of semantics.) Whether the program is specifying an order of operations is lost when the language offers no way for a coder to distinguish between caring about order and not caring. This is especially difficult since the vast majority of people don’t care and don’t consider their own code to be a specification on order of operations. Worse, most people would even be surprised and/or annoyed if the compiler didn’t do certain simplifications and constant folding, which change the results. The few cases where people do care about order can be extremely important, but they are rare nonetheless.
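To make the order-of-summation point concrete, here is a small Python sketch (Python floats are IEEE 754 doubles on mainstream platforms). `sequential_sum` and `lane_sum` are made-up names for illustration; `lane_sum` only imitates what a 2-lane vectorized reduction might do and is not how any particular compiler works.

```python
def sequential_sum(values):
    """Add the elements strictly left to right, as the source loop is written."""
    total = 0.0
    for x in values:
        total += x
    return total


def lane_sum(values, lanes=2):
    """Imitate a vectorized reduction: one partial sum per lane, combined at the end."""
    partial = [0.0] * lanes
    for i, x in enumerate(values):
        partial[i % lanes] += x
    return sequential_sum(partial)


values = [1e16, 1.0, -1e16, 1.0]

print(sequential_sum(values))  # 1.0: the first 1.0 is absorbed by 1e16
print(lane_sum(values))        # 2.0: both 1.0 values land in the same lane
```

Both functions receive exactly the same input; only the grouping of the additions differs, and nothing in the source loop says which grouping the author wanted.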
They are; just compare anything fixed-point on the 486SX against anything floating-point on a 486DX. It's faster to scale, sum, and print at the desired precision than to operate on floats.
I wonder... couldn't there just be some library type for this, e.g. `associative::float` and `associative::double` and such (in C++ terms), so that compilers can ignore non-associativity for actions on values of these types? Or attributes one can place on variables to force assumption of associativity?
While it is technically correct to say this, it also gets the wrong point across, because it leaves out the fact that ordering changes create only a small difference. Other examples where arithmetic is not commutative, e.g. matrix multiplication, can create much larger differences.
Floating-point arithmetic is non-associative, but it is commutative for the operations that are algebraically commutative: x + y == y + x and x*y == y*x. And x - y = -(y - x) so subtraction is properly anti-commutative.
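A quick way to check this yourself, as a sketch in Python (whose floats are IEEE 754 doubles on mainstream platforms); the variable names are just for illustration:

```python
a, b, c = 0.1, 0.2, 0.3

# Associativity fails: regrouping the same three operands changes the rounding.
left = (a + b) + c
right = a + (b + c)
print(left, right, left == right)  # 0.6000000000000001 0.6 False

# Commutativity holds: swapping the two operands of a single operation
# gives the same correctly rounded result.
print(a + b == b + a)      # True
print(a * b == b * a)      # True
print(a - b == -(b - a))   # True
```

The three commutativity checks hold for any non-NaN operands, not only for these values.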
The only very marginal exception to this is that when both arguments are NaN, the return value will be NaN, but which NaN payload is returned can depend on argument order. But no one ever uses this because it's not specified, so it can't be used reliably for anything useful. The behavior I wish IEEE 754 had specified for this is to define a standard NaN value (or two), and when the return value of an op is NaN, and some of the arguments are non-standard NaNs, then one of those non-standard NaN values must be returned. This doesn't depend on argument order and allows NaN payloads to be reliably propagated, which would let you encode useful debugging information in NaN payloads and know that it will flow through the program.
For mathematical use, NaN payloads shouldn’t matter, and NaNs with different payloads behave identically (aside from quiet vs. signaling NaNs). It also doesn’t matter for equality comparison, because NaNs always compare unequal.
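For anyone who wants to poke at payloads directly, here is a rough Python sketch using the standard `struct` module to build quiet NaNs with chosen payload bits. `nan_with_payload` and `payload_of` are hypothetical helper names, and the bit layout assumed is IEEE 754 binary64.

```python
import math
import struct

QUIET_NAN_BITS = 0x7FF8000000000000  # exponent all ones, quiet bit set
PAYLOAD_MASK = 0x0007FFFFFFFFFFFF    # the 51 bits below the quiet bit


def nan_with_payload(payload):
    """Build a quiet NaN whose low mantissa bits carry `payload`."""
    bits = QUIET_NAN_BITS | (payload & PAYLOAD_MASK)
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


def payload_of(x):
    """Read the payload bits back out of a double."""
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    return bits & PAYLOAD_MASK


a = nan_with_payload(0x123)
b = nan_with_payload(0x456)

print(math.isnan(a), math.isnan(b))  # True True
print(a == a, a == b)                # False False: NaNs never compare equal
print(hex(payload_of(a)))            # 0x123
print(hex(payload_of(a + b)), hex(payload_of(b + a)))  # not specified; may differ
```

The last line is deliberately left without an expected output: which payload survives `a + b` versus `b + a` depends on the hardware, which is exactly the unspecified behavior being discussed.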
From the user perspective it's not too bad, but from the compiler perspective it is. The result of this is that LLVM has decided that trying to figure out which NaN you got (e.g. by casting to an Int and comparing) is UB, which means pretty much every floating point operation becomes non-deterministic.
This also adds extra complexity to the CPU. You need special hardware for == rather than just using the perfectly good integer unit, and every FPU operation needs to devote a bunch of transistors to handling this nonsense that buys the user absolutely nothing.
There are definitely things to criticize about the design of Posits, but the thing they 100% get right is having a single NaN and sane ordering semantics.