	by lysium 1968 days ago
	I would never have thought that the output of some floating-point operations depends on the "reset of the floating point package". What gives?

6 comments

comex 1968 days ago

The processor itself supports configuring some aspects of floating point behavior; on x86 this is the MXCSR register. The most important one is rounding mode: you can choose round-to-nearest, round-down, round-up, or round-towards-zero. Less common options include whether to raise an exception on various out-of-range conditions, and whether to treat denormal numbers as zero (faster but less accurate and not IEEE 754 compliant).

The C standard library has functions to do this configuration; search for fesetenv. _fpreset is not a standard function but, at least in Wine's implementation [1], just resets the relevant configuration registers to an initial state.

[1] https://github.com/wine-mirror/wine/blob/e909986e6ea5ecd49b2...

lysium 1968 days ago

Very informative! It even sets SSE2.

brucehoult 1968 days ago

At a minimum, when there is a switch between threads the floating point state of the old thread needs to be saved and the floating point state of the incoming thread loaded. This includes such things as rounding mode.

If the thread package was not setting up the "saved" state for a new thread before switching to it, then that state might not be what was desired. The best thing to do would probably be to copy the current FP state of the parent at the moment the thread is created.

On a long series of calculations changing the rounding mode could well be enough to cause the difference in results the poster saw.

I would tend to the opinion that redoing the calculation with different rounding modes is in fact a not bad way to figure out how reliable the results are in the first place.

(the rounding mode is not the only thing affected by not saving and restoring the FP state correctly)

willxinc 1968 days ago

Things like rounding mode and denormal behavior could be controlled by CPU flags. I'm not familiar with x86, but this is definitely the case in PPC.

lysium 1968 days ago

Yes, that makes sense, thank you!

wiml 1968 days ago

Some kind of internal state specific to Microsoft's libm? It doesn't look like _fpreset() is present anywhere else.

Though in general, IEEE-754 math does have a small amount of global state, for stuff like rounding modes and subnormals. Perhaps the spawned thread's fenv was unexpected.

I'm not super convinced by this blog post anyway, since it immediately confuses associativity with commutativity.

zaitanz 1968 days ago

Op here. Yea, this only occurs on Windows with MinGW. Visual C++ and Clang do not have this issue. None of the Linux compilers I tested had this issue either.

And yea you're right on the associative mistake. Will correct this :)

chrchang523 1968 days ago

I currently build Windows binaries via cross-compilation on Linux using gcc-mingw-w64; I assume that is affected by the bug?

zaitanz 1968 days ago

Most likely. You can check with the pastebin code https://pastebin.com/thTapSgn . Local and Thread outputs should be the same.

zaitanz 1968 days ago

TBH, I had no idea this was a thing either. I was incredibly confused when I identified a single piece of code that was producing a variation in result.

cperciva 1968 days ago

Probably the author is accidentally getting 80-bit "extended double" arithmetic.

cozzyd 1968 days ago

That should be clear from looking at the assembly, shouldn't it? Don't think the floating point environment can change that but I could be wrong.

Floating-point rounding modes seem more likely to me. The author should be able to dump the floating point configuration to confirm, I'm sure.

cperciva 1968 days ago

The same x86 instructions are used for arithmetic on float/double/extended types; the difference is a precision setting in the x87 control word.

cozzyd 1968 days ago

Well in most cases I've looked at generated assembly (not that often), the xmm registers are used even for scalar operations, which I thought was the default option for gcc on x86-64, but I suppose it might differ on different systems (or perhaps 32-bit mode was used for some reason).

cperciva 1968 days ago

Right, if you're using xmm regs you're getting double or single precision. Sometimes the x87 regs get used and that's when "accidentally computed with extra precision and then double-rounded" comes up.