by thxg 1250 days ago
> It is pretty crazy imho that gcc defaults to using fma

Yes! Different people can make different performance-vs-correctness trade-offs, but I also think reproducible-by-default would be better.

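Fortunately, specifying a strict ISO C standard (e.g. -std=c99) implies -ffp-contract=off. As far as I can tell, this applies only to C, not to C++ modes such as -std=c++11, where the option has to be passed explicitly. I guess specifying such a standard is probably a good idea independently when we care about reproducibility.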

Edit: Thinking about it, in the days of 80-bit x87 FPUs, strictly following the standard (specifically, always rounding to 64 bits after every operation) may have been prohibitively expensive. This may explain gcc's GNU mode defaulting to -ffast-math.

2 comments

GCC doesn't default to non-conforming behaviour like -ffast-math -- that's Intel's compiler (or at least it enables a similar option by default). That's usually why people mistakenly think GCC vectorization is deficient, in particular if they don't use -funsafe-math-optimizations.

Indeed GCC does not enable -ffast-math by default. Unfortunately, -ffast-math and -funsafe-math-optimizations (despite the name) are not the only options that prevent bit-for-bit-reproducible floating point.

For example, -ffp-contract=fast is enabled by default [1], and it leads to different floating-point roundings: compare [2], which generates an FMA instruction, with [3], where -std=c99 is specified.

As another example, -fexcess-precision=fast is also enabled by default: [4] does intermediate calculations in the 80-bit x87 registers, while [5] has additional loads and stores to reduce the precision of intermediate results to 64 bits.

In both examples, GCC generates code that does not conform to IEEE-754, unless -std=c99 is specified.

[1] From the man page:

    -ffp-contract=style
           -ffp-contract=off disables floating-point expression
           contraction.  -ffp-contract=fast enables floating-point
           expression contraction such as forming of fused multiply-
           add operations if the target has native support for them.
           -ffp-contract=on enables floating-point expression
           contraction if allowed by the language standard.  This is
           currently not implemented and treated equal to
           -ffp-contract=off.
           
           The default is -ffp-contract=fast.

[2] https://godbolt.org/z/GKb7G4nW9

[3] https://godbolt.org/z/KTnqcT6aW

[4] https://godbolt.org/z/4q31oEe14

[5] https://godbolt.org/z/qdf4hceca

> Edit: Thinking about it, in the days of 80-bit x87 FPUs, strictly following the standard (specifically, always rounding to 64 bits after every operation) may have been prohibitively expensive

afaik you could just set the precision of the x87 (via the precision-control field of its control word) to 32/64/80 bits, and there would not be any extra cost to the operations
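
For anyone wondering what "rounding after every operation" changes in the result, here is a small C sketch. It is only an analogy: it uses float with double intermediates as a stand-in for double with 80-bit x87 intermediates, because that reproduces on any machine with IEEE-754 float and double. The function names are hypothetical.

```c
/* Round after every operation: each intermediate is stored as a
   32-bit float. The volatile stores force that rounding even on
   targets that would otherwise keep excess precision in registers. */
float sum_rounded_each_step(float a, float b, float c) {
    volatile float partial = a + b;
    volatile float total = partial + c;
    return total;
}

/* Keep a wider intermediate and round only once at the end. */
float sum_wide_intermediate(float a, float b, float c) {
    double wide = (double)a + (double)b + (double)c;
    return (float)wide;
}
```

With a = 16777216.0f (2^24) and b = c = 1.0f, the first function returns 16777216.0f, because 2^24 + 1 is not representable as a float and rounds back down at each step. The second returns 16777218.0f, since the double intermediate holds 2^24 + 2 exactly and that value fits in a float.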