The processor itself supports configuring some aspects of floating point behavior; on x86 this is the MXCSR register. The most important one is rounding mode: you can choose round-to-nearest, round-down, round-up, or round-towards-zero. Less common options include whether to raise an exception on various out-of-range conditions, and whether to treat denormal numbers as zero (faster but less accurate and not IEEE 754 compliant).
The C standard library has functions to do this configuration; search for fesetenv. _fpreset is not a standard function but, at least in Wine's implementation [1], just resets the relevant configuration registers to an initial state.
At a minimum, when there is a switch between threads the floating point state of the old thread needs to be saved and the floating point state of the incoming thread loaded. This includes such things as rounding mode.
If the thread package was not setting up the "saved" state for a new thread before switching to it then that state could be not what was desired. The best thing to do would probably be to copy the current FP state of the parent at the moment the thread is created.
On a long series of calculations changing the rounding mode could well be enough to cause the difference in results the poster saw.
I would tend to the opinion that redoing the calculation with different rounding modes is in fact a not bad way to figure out how reliable the results are in the first place.
(the rounding mode is not the only thing affected by not saving and restoring the FP state correctly)
Some kind of internal state specific to Microsoft's libm? It doesn't look like _fpreset() is present anywhere else.
Though in general, IEEE-754 math does have a small amount of global state, for stuff like rounding modes and subnormals. Perhaps the spawned thread's fenv was unexpected.
I'm not super convinced by this blog post anyway, since it immediately confuses associativity with commutativity.
Op here. Yea this only occurs on Windows with MinGW. Visual C++ and Clang do not have this issue. None of the Linux compilers I tested also had this issue.
And yea you're right on the associative mistake. Will correct this :)
TBH. I had no idea this was a thing either. I was incredibly confused when I identified a single piece of code that was producing an variation in result.
Well in most cases I've looked at generated assembly (not that often), the xmm registers are used even for scalar operations, which I thought was the default option for gcc on x86-64, but I suppose it might differ on different systems (or perhaps 32-bit mode was used for some reason).
Right, if you're using xmm regs you're getting double or single precision. Sometimes the x87 regs get used and that's when "accidentally computed with extra precision and then double-rounded" comes up.
The C standard library has functions to do this configuration; search for fesetenv. _fpreset is not a standard function but, at least in Wine's implementation [1], just resets the relevant configuration registers to an initial state.
[1] https://github.com/wine-mirror/wine/blob/e909986e6ea5ecd49b2...