| HN Mirror

by schemescape 1096 days ago

I didn’t realize that getting 100% deterministic behavior for floats would require using the same compiler, even when using the same math functions.

Is floating point arithmetic not fully specified or something?

Edit: should have searched the web first: https://stackoverflow.com/questions/49471943/floating-point-...

6 comments

ColFrancis 1095 days ago

As I understand it, it's more that the mathematical functions aren't specified by the standard.

There is no sqrt instruction. So you start with a Taylor approximation, then maybe do a few rounds of Newton's method to refine the answer. But what degree is the Taylor series? Where did you centre it (can't be around 0)? How many Newton iterations did you do? IEEE 754 might dictate what multiplication looks like, but for sqrt, sin, cos, tan you need to come up with a method of calculation, and that's in the standard library, which usually comes from the compiler. You could make your own implementations but...
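
To illustrate why the iteration count matters in a software square root, here is a minimal sketch in Python. The helper `newton_sqrt`, the starting guess, and the iteration counts are all made up for illustration and are not taken from any real libm.

```python
import math

def newton_sqrt(x: float, guess: float, iterations: int) -> float:
    """Approximate sqrt(x) with a fixed number of Newton steps (hypothetical helper)."""
    g = guess
    for _ in range(iterations):
        g = 0.5 * (g + x / g)  # Newton update for f(g) = g*g - x
    return g

rough = newton_sqrt(2.0, 1.0, 3)    # stops early
refined = newton_sqrt(2.0, 1.0, 6)  # runs until the iteration has settled
print(rough, refined, math.sqrt(2.0))
print(rough == refined)  # False
```

Two implementations that stop at different points return different doubles for the same input, which is the kind of divergence described above.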

Floats are not associative: (a+b)+c != a+(b+c). So even something like reordering fundamental operations can cause divergence.
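
A quick way to see the non-associativity, sketched in Python (whose `float` is an IEEE 754 double on common platforms):

```python
a, b, c = 0.1, 0.2, 0.3

left = (a + b) + c   # rounds after a + b, then again after adding c
right = a + (b + c)  # rounds after b + c, then again after adding a

print(left)           # 0.6000000000000001
print(right)          # 0.6
print(left == right)  # False
```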

zokier 1095 days ago

> There is no sqrt instruction. So you start with a Taylor approximation, then maybe do a few rounds of Newton's method to refine the answer. But what degree is the Taylor series? Where did you centre it (can't be around 0)? How many Newton iterations did you do? IEEE 754 might dictate what multiplication looks like, but for sqrt, sin, cos, tan you need to come up with a method of calculation, and that's in the standard library, which usually comes from the compiler. You could make your own implementations but...

This is just plain wrong. IEEE 754 defines many common mathematical functions, sqrt and trig included, and recommends that they return values correct to the last ulp. Most common CPUs have hardware instructions for those, although Intel's implementation is notoriously bad.

Furthermore, there are some high-quality FP libraries out there, like SLEEF and rlibm, and the CORE-MATH project, which aims to improve the standard libraries in use.

cornstalks 1095 days ago

> recommends

That word is really important because it means you can’t rely on it for real determinism.

dundarious 1095 days ago

Your general point stands, but with corrections:

- sqrt actually can be a single instruction on x86_64 for example. Look at how musl implements it, it's just a single line inline asm statement, `__asm__ ("sqrtsd %1, %0" : "=x"(x) : "x"(x));`: https://github.com/ifduyue/musl/blob/master/src/math/x86_64/... Of course, not every x86_64 impl must use that instruction, and not every architecture must match Intel's implementation. I've never looked into it, but wouldn't be surprised if even Intel and AMD have some differences.

- operator precedence is well-defined, and IEEE 754 compliant compilation modes respect the non-associativity. In most popular compilers, you need to pass specific flags to allow the compiler to change the associativity of operations (-ffast-math implies -funsafe-math-optimizations which implies -fassociative-math which actually breaks strict adherence to the program text). A somewhat similar issue does arise with the floating point environment though, as statements may be reordered, the order of evaluation of function arguments may differ, etc.

The fact that compilers respect the non-associativity of program text is a huge reason why compilers are very limited in how much auto-vectorization they will do for floating point. The classic example is a sum of squares, where the compiler bars itself from even loop unrolling, never mind SIMD with FMADDs. To do any of that, you have to fiddle with compiler options that are often problematic when enabled globally, or __attribute__((optimize("..."))) or __pragma(optimize("...")), or probably best of all, explicitly vectorize using intrinsics.
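
A rough Python model of why the sum of squares is a problem: the "two lanes" function below is a hypothetical stand-in for what a 2-wide vectorized loop would compute, not actual compiler output, and the input values are chosen purely to make the difference visible.

```python
def sum_squares_sequential(xs):
    """Add the squares strictly in program order."""
    total = 0.0
    for x in xs:
        total += x * x
    return total

def sum_squares_two_lanes(xs):
    """Model of a 2-wide SIMD loop: even and odd elements accumulate separately."""
    lanes = [0.0, 0.0]
    for i, x in enumerate(xs):
        lanes[i % 2] += x * x
    return lanes[0] + lanes[1]  # horizontal add at the end

xs = [1e8, 1.0, 0.0, 1.0]
seq = sum_squares_sequential(xs)
vec = sum_squares_two_lanes(xs)
print(seq == vec)  # False
print(vec - seq)   # 2.0
```

In program order each `1.0` is absorbed by the running total of `1e16`, while the lane version adds the two small squares together first, so they survive.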

virtue3 1095 days ago

I'm not quite sure they are guaranteed to be deterministic across architectures either.

I think rounding up/down can vary across architectures. Mixing ARM/AMD/Intel here could potentially result in non-deterministic behavior. https://en.wikipedia.org/wiki/IEEE_754#Reproducibility

Ah... I see. The game is using a software implementation of floating point maths to avoid this issue :). Smart. That in combination with the unified compiler should get you out of most of the trouble.

eliasmacpherson 1095 days ago

There's a rather long article here, from Sun, now owned by Oracle - probably easier to start from the conclusion and work back!

https://docs.oracle.com/cd/E77782_01/html/E77791/z4002282485...

zokier 1095 days ago

Float determinism is one of those things where, in theory, floats should behave deterministically, but in practice it's the wild west. One big problem is that in C (and in many other languages), common operators are not explicitly mapped to specific FP operations, which gives compilers quite a lot of leeway in what they can do.

Another issue is that FP uses some amount of invisible global state, e.g. rounding modes. Iirc for example DirectX likes to change the flags behind your back.

chpatrick 1095 days ago

Even using the same compiler, a different -march can give you substantially different results.

vlovich123 1095 days ago

I think it’s just about Intel’s intermediate 80-bit precision. Anything else?

nwallin 1095 days ago

There are lots of different ways floating point determinism can get lost. The obvious one, as you mentioned, is the 80-bit x87 unit. Lots of 32-bit compilers will compile to doing floating point math on the x87, which is slow, but the same compiler compiling in 64-bit mode will use SSE2 instructions.

Floating point arithmetic is not associative. That is, (a+b)+c != a+(b+c) and (a*b)*c != a*(b*c).

Multiplying by the reciprocal is not equal to dividing by the value. That is, x*(1/y) != x/y. When you divide by a constant, an optimizing compiler may turn that into a multiplication by the reciprocal instead; that is, if you have code that divides by the constant 3 it will multiply by 0.33333333333333333, because it's a lot faster.
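
The arithmetic difference is easy to reproduce in Python. This only shows the two expressions rounding differently; it says nothing about when a particular compiler actually performs the substitution.

```python
x, y = 3.0, 10.0

quotient = x / y                # one rounding
via_reciprocal = x * (1.0 / y)  # two roundings: 1/y first, then the product

print(quotient)                    # 0.3
print(via_reciprocal)              # 0.30000000000000004
print(quotient == via_reciprocal)  # False
```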

FMA (fused multiply-add) instructions are more precise and therefore not equal to the same calculation without FMA. That is, a*b+c != a*b+c if one compiler will output FMA instructions and the other one does not. (this will be true even in the same compiler with different flags)
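
A sketch of the FMA difference in Python. The helper `fused_multiply_add` is hypothetical: it emulates the single rounding of a hardware FMA by doing the arithmetic exactly with `fractions.Fraction` and rounding once at the end. The inputs are hand-picked so the product's low bits get lost in the unfused version.

```python
from fractions import Fraction

def fused_multiply_add(a: float, b: float, c: float) -> float:
    """Emulate FMA: compute a*b + c exactly, then round to a double once."""
    return float(Fraction(a) * Fraction(b) + Fraction(c))

a = b = 1.0 + 2.0**-27
c = -(1.0 + 2.0**-26)

unfused = a * b + c                  # a*b is rounded before c is added
fused = fused_multiply_add(a, b, c)  # only the final result is rounded

print(unfused)            # 0.0
print(fused == 2.0**-54)  # True
```

The exact product is 1 + 2^-26 + 2^-54; the unfused path rounds away the 2^-54 term before the addition, so the result collapses to zero.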

Special functions are fucky. sqrt, sin, cos etc. might not always give equal values across compilers, or even in the same compiler if minor alterations to the code are made. A compiler might use one algorithm to compute sin(x), but a different algorithm if it needs to compute both sin(x) and cos(x) at the same time.

Floating point rounding mode is a thing. Sometimes a plugin changes your floating point rounding mode. Sometimes this plugin will be inserted into your runtime without your knowledge, such as an antivirus program, or malware that hijacks your browser to give "better"/"customized" shopping/search recommendations. There was a bug writeup about a crash in Chrome several years back, but I can't find it.

Basically you should assume that floating point math is non-deterministic. If you think you need deterministic floating point math, try to reformulate the problem so that you don't need deterministic floating point math. If you really need deterministic floating point math, understand that you're signing up for a lot of pain.

vlovich123 1094 days ago

AFAIK, the first three things you listed would be deterministic across compilers unless you enabled -ffast-math (which isn't what this is talking about) precisely for that reason. I believe the same applies to intrinsics and rounding modes but not sure, especially since the website talks about GCC vs clang differences for the latter.

[1] https://stackoverflow.com/questions/55974090/clang-gcc-only-...

kevingadd 1095 days ago

Any reorderings or optimizations of floating point code could potentially change the results even without 80-bit precision active.

vlovich123 1094 days ago

AFAIK only if you have -ffast-math enabled. Otherwise IEEE 754 very specifically lays out all the optimizations the compiler is and isn't allowed to do.

rrobukef 1095 days ago

Constant folding of floats may depend on the compiler that built your compiler (e.g. 0.5 + x + 0.6 + 0.7), the compiler may be non-deterministic (std::unordered_map<float> constants), and it may depend on the CPU your compiler is running on.
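
To make the `0.5 + x + 0.6 + 0.7` case concrete, here is a small Python sketch. It assumes a compiler that is allowed to pull the constants together (for example under the fast-math style flags discussed elsewhere in the thread); the function names and the value of `x` are made up for illustration.

```python
def as_written(x: float) -> float:
    """Evaluate strictly left to right, as the source text reads."""
    return 0.5 + x + 0.6 + 0.7

def constants_folded(x: float) -> float:
    """What a reassociating compiler might emit: fold the constants first."""
    folded = 0.5 + 0.6 + 0.7
    return x + folded

x = 1e16
print(as_written(x) == constants_folded(x))  # False
print(constants_folded(x) - as_written(x))   # 2.0
```

At this magnitude each constant alone is too small to change `x`, but their folded sum is large enough to move it to the next representable double.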
