by mgaunard 1356 days ago
Typical technobabble from someone who doesn't really understand floating-point.

Floating-point arithmetic is not associative. Reordering operations yields different results, so no conforming compiler will reorder them unless you explicitly opt out of standards conformance (for example with -ffast-math in GCC or Clang).
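
To make the non-associativity concrete, here is a minimal sketch. It uses Python only because its float is an IEEE 754 double on mainstream platforms, so the arithmetic matches a C double evaluated without extended precision. The values are picked for illustration and do not come from the thread.

```python
# Floating-point addition is not associative: regrouping changes
# which intermediate result gets rounded.
a, b, c = 0.1, 0.2, 0.3

left = (a + b) + c   # a + b is rounded first
right = a + (b + c)  # b + c is rounded first

print(left)           # 0.6000000000000001
print(right)          # 0.6
print(left == right)  # False
```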

The use of SIMD, which is just a type of instruction-level parallelism, has no effect on the result of floating-point operations, unless of course you reorder your operations so that they may be parallelized.
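
As a rough illustration of that last clause, the sketch below compares a left-to-right sum with the kind of reordered sum a vectorized reduction performs (several independent accumulators combined at the end). The function names and the four-lane layout are assumptions made for the example, not a model of any particular compiler's output.

```python
def sum_sequential(xs):
    """Add the elements strictly left to right, as the source code says."""
    total = 0.0
    for x in xs:
        total += x
    return total


def sum_four_lanes(xs):
    """Add with four independent accumulators, then combine them.

    This is the reordering a 4-wide vectorized reduction would need.
    """
    lanes = [0.0, 0.0, 0.0, 0.0]
    for i, x in enumerate(xs):
        lanes[i % 4] += x
    return (lanes[0] + lanes[1]) + (lanes[2] + lanes[3])


xs = [1e16, 1.0, -1e16, 1.0]  # the exact sum is 2

print(sum_sequential(xs))  # 1.0
print(sum_four_lanes(xs))  # 0.0
```

Each individual addition is still correctly rounded in both versions; only the order differs, which is why a conforming compiler will not make this transformation on its own.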

What does affect the result of floating-point operations is when rounding happens and at what precision. If we're talking about C, the compiler is allowed to evaluate intermediate operations at a higher precision than the one mandated by their type. This is merely so that it can use x87, whose registers hold 80-bit extended-precision values by default (typically padded to 96 bits in memory on 32-bit x86), and only round when it spills to memory and needs to store a 64-bit or 32-bit value. Compilers have flags to disable that behaviour, and it doesn't apply when the SSE unit is used instead of x87. Using SSE for floating-point doesn't necessarily mean using SIMD; most of the instructions have scalar variants.
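
The effect of keeping intermediates at a wider precision can be imitated in a few lines. This is only a sketch under stated assumptions: binary64 stands in for the wider register format and binary32 for the declared type, and `to_f32` is a hypothetical helper that rounds through `struct`. It does not reproduce actual x87 behaviour.

```python
import struct


def to_f32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


a, b, c = to_f32(1e8), to_f32(1.0), to_f32(-1e8)

# Strict evaluation: every intermediate is rounded to the declared type.
strict = to_f32(to_f32(a + b) + c)

# Wider intermediates: compute at higher precision, round once at the end
# (analogous to rounding only when the value is stored to memory).
extended = to_f32((a + b) + c)

print(strict)    # 0.0
print(extended)  # 1.0
```

Whether a real compiler produces the first or the second result depends on whether the intermediate stays in a register, which is the point made above.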

Another example is FMA (fused multiply-add), which the compiler may substitute for a multiply followed by an add, and which rounds once instead of twice.
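
A small sketch of why the substitution changes results. `fma_emulated` is a hypothetical helper that imitates a fused multiply-add by computing a*b + c exactly with rationals and rounding once; it is not a hardware FMA, and the inputs are chosen so that the product falls exactly halfway between two doubles.

```python
from fractions import Fraction


def fma_emulated(a, b, c):
    """Compute a*b + c exactly, then round once to a double."""
    return float(Fraction(a) * Fraction(b) + Fraction(c))


a = 1.0 + 2.0**-27
b = 1.0 - 2.0**-27
c = -1.0

# Separate multiply and add: the product 1 - 2**-54 rounds to 1.0,
# so the subtraction yields exactly zero.
separate = a * b + c

# Fused: the exact product survives until the single final rounding.
fused = fma_emulated(a, b, c)

print(separate)            # 0.0
print(fused == -2.0**-54)  # True
```

The fused result is the more accurate one here, but code that tests for an exact zero or relies on a*b - a*b being zero can change behaviour when the compiler contracts the expression.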

In practice, if your code breaks because of this, it just means it was incorrect in the first place.

2 comments

The actual rules are very complicated. C allows greater precision for intermediate results but compilers are sometimes careful to stick to IEEE rounding. [1] contains a good general overview, and [2] talks about FMA in particular. And in [3] I've set up a Godbolt example to play with. By default -O3 gives you FMA, but -O or -O3 with -ffp-contract=off don't. So you absolutely can get different results depending on optimization levels.

[1]: https://randomascii.wordpress.com/2012/03/21/intermediate-fl...

[2]: https://kristerw.github.io/2021/11/09/fp-contract/

[3]: https://godbolt.org/z/eTz8o6b3P

The rule is very simple, I'm not seeing anything in what you say suggesting that it isn't?

Perhaps the rule in the standard is simple (the compiler can arbitrarily round to finer precision than IEEE), but in practice it's complicated, as the same code can behave quite differently depending on what chip it's compiled for, the level of optimization, and other factors. If you want to control it, i.e. model it as something other than nondeterminism, figuring out the right combination of compiler flags and so on is tricky.

I'll also point out that FMA is relatively new, so it's pretty easy to write code that works fine when compiled for default x86_64/SSE2 but breaks when compiled for a more recent target CPU.

All you're doing is listing simple consequences of the rule.

The compiler may use increased precision for intermediate computations. That means sometimes it will, sometimes it won't. If you understand the basics of the situations where it will do so, you can see it depends on register allocation, which of course not only depends on optimization level, but also can change anytime you change anything at all in the source code.

You do realize programming languages don't necessarily expose the architecture's native floating-point operations, but are free to define any semantics they want? You know, a language could have number types that work like in math; symbolic math tools, for example, do exactly that.

Also, that kind of language is absolutely not warranted.