| HN Mirror



	by colejohnson66 1199 days ago
	But how often is determinism in the LSB really needed?

6 comments

jcranmer 1198 days ago

There are some things that end up "accidentally" demanding exact equality of floats. Comparing the reference output of a program that writes floating-point numbers comes to mind. A multiplayer game that relies on each client computing gamestate locally can get desyncs to happen if two clients compute floats that differ even by that last bit.

Another, admittedly niche, scenario is that numerics code can use lower-precision types to emulate higher-precision types (e.g., represent a number with a pair of doubles).

nine_k 1198 days ago

AFAICT any modern multiplayer game uses a central node that forces a coherent state of the world on all clients; otherwise cheating through desync becomes a serious problem. (Trusting clients in adversarial transactions, like deathmatch, is hard in general: cheating through altered rendering also used to be a thing.)

yvdriess 1198 days ago

Depends on the game and platform. RTS games tend to still have peer to peer netcode. Deterministic simulation is still a thing.

Here is an interesting GDC talk on building a fixed precision engine core to ensure determinism: https://youtu.be/wwLW6CjswxM

bee_rider 1198 days ago

This is totally off the cuff, but if you were using a pair of doubles like that, you’d essentially need to represent these new operations with numerical algorithms on the pair of doubles, right? So the algorithms would presumably need to deal with the last bit being a bit dodgy at times, right?

pclmulqdq 1198 days ago

With appropriate rounding modes, the basic operations (+, -, *) are fairly short code sequences. Things like exp and 1/x can become adventures, but they're not too bad.
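To make "fairly short code sequences" concrete, here is a minimal sketch of the usual building block for pair-of-doubles addition, Knuth's TwoSum. It assumes IEEE 754 doubles in round-to-nearest (which is what Python floats use on mainstream platforms); the function name `two_sum` is just illustrative.

```python
from fractions import Fraction


def two_sum(a: float, b: float) -> tuple[float, float]:
    """Knuth's TwoSum: return (s, e) where s = fl(a + b) and s + e == a + b exactly.

    Assumes round-to-nearest and no overflow.
    """
    s = a + b
    b_virtual = s - a            # the part of b that actually made it into s
    a_virtual = s - b_virtual    # the part of a that actually made it into s
    e = (a - a_virtual) + (b - b_virtual)  # what rounding threw away
    return s, e


# 1e-20 is far below one ulp of 1.0, so it vanishes from s but survives in e.
s, e = two_sum(1.0, 1e-20)
print(s, e)

# Check the exactness claim with rational arithmetic.
print(Fraction(s) + Fraction(e) == Fraction(1.0) + Fraction(1e-20))
```

Note that every step here relies on each individual addition being correctly rounded. If the last bit of a basic operation were unreliable, the error term `e` would be meaningless, which is the point of the parent comment.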

stephencanon 1198 days ago

Division is actually pretty straightforward: compute a residual using the multiply-add that you already have, divide _that_, and then add it to the quotient.

Or (roughly equivalently, but maybe easier to understand) do a native double division, then do a Newton-Raphson step, which requires only multiplication and addition (just like you would refine a reciprocal estimate).
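A rough sketch of the residual approach, for the simplest case of dividing one double by another and getting a pair-of-doubles quotient. Since a fused multiply-add is not available in every Python version, this sketch substitutes Dekker's error-free product (with Veltkamp splitting) to get the exact residual; that swap and all the function names are my assumptions, not something from the comment above.

```python
from fractions import Fraction

_SPLITTER = 134217729.0  # 2**27 + 1, Veltkamp's constant for 53-bit doubles


def split(a: float) -> tuple[float, float]:
    """Split a into hi + lo, each with at most 26 significant bits."""
    c = _SPLITTER * a
    hi = c - (c - a)
    return hi, a - hi


def two_product(a: float, b: float) -> tuple[float, float]:
    """Dekker's product: p = fl(a * b) and p + e == a * b exactly.

    Assumes round-to-nearest and no overflow or underflow.
    """
    p = a * b
    a_hi, a_lo = split(a)
    b_hi, b_lo = split(b)
    e = a_lo * b_lo - (((p - a_hi * b_hi) - a_lo * b_hi) - a_hi * b_lo)
    return p, e


def div_refined(a: float, b: float) -> tuple[float, float]:
    """Return (q_hi, q_lo) with q_hi + q_lo approximating a / b beyond double precision."""
    q_hi = a / b                  # native double division
    p, e = two_product(q_hi, b)   # q_hi * b as an exact pair
    r = (a - p) - e               # residual a - q_hi * b
    q_lo = r / b                  # divide the residual, keep it as the low word
    return q_hi, q_lo


q_hi, q_lo = div_refined(1.0, 3.0)
exact = Fraction(1, 3)
print(float(abs(Fraction(q_hi) - exact)))                   # error of the plain quotient
print(float(abs(Fraction(q_hi) + Fraction(q_lo) - exact)))  # error after the correction
```

With a hardware FMA, the residual would be a single operation (`fma(-q_hi, b, a)`), which is what makes this cheap in practice.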

andrewmcwatters 1198 days ago

Happens very frequently in multiplayer games. As a result, if you don't have it, you can't byte-compare floats and need to have an "approximate" function with some predefined epsilon slop.
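For illustration, the difference between a byte comparison and an epsilon comparison looks roughly like this. The helper `same_bits` and the tolerance `1e-9` are arbitrary choices for the sketch; picking a sensible tolerance is the hard part in real code.

```python
import math
import struct


def same_bits(x: float, y: float) -> bool:
    """Byte-compare two doubles (note: this also distinguishes 0.0 from -0.0)."""
    return struct.pack("<d", x) == struct.pack("<d", y)


a = 0.1 + 0.2
b = 0.3

print(same_bits(a, b))                   # False: the two values differ in the last bit
print(math.isclose(a, b, rel_tol=1e-9))  # True: equal within the chosen slop
```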

magicalhippo 1198 days ago

I recall LHC@Home had some issue with this[1]. They ran simulations of beams in the LHC using BOINC; each run could cover up to a million revolutions. The goal was to do parameter searches to find the best magnet settings before going live, so they could spend less time tuning the real thing.

Anyway, as with other BOINC projects, they sent each work unit (simulation run) to at least two (or was it three?) different computers and compared results, to ensure correctness. And they found that they got quite a lot of work units which disagreed and had to be sent to more computers for validation.

After some digging and eliminating factors like overclocked CPUs, they found that usually, all Intel machines would agree and all AMDs would agree, but Intel and AMD would disagree. Like, a run that would hit the detector wall after 30 revolutions on Intels could go on for many thousands of revolutions on AMDs.

Further digging led to the discrepancy in lower bits of transcendental operations in the FPUs[2]. After switching to a software library for these operations, at the cost of a few % in speed, they got Intels and AMDs to agree.

So yeah, when you do a large number of iterated operations like this, even a single LSB of difference can lead to issues.
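A toy way to see this effect, using the chaotic logistic map as a stand-in (this is not the LHC tracking code, just an iterated nonlinear update). Two trajectories start one ulp apart and the sketch reports the first step at which they differ by more than `tol`. The function name, the starting value 0.9, and the thresholds are all assumptions for illustration; `math.nextafter` needs Python 3.9 or later.

```python
import math


def first_divergence(x0: float, tol: float = 0.1, max_steps: int = 200):
    """Iterate x -> 4x(1 - x) from x0 and from the next double above x0.

    Returns the first step at which the two trajectories differ by more
    than tol, or None if that never happens within max_steps.
    """
    a = x0
    b = math.nextafter(x0, 1.0)  # differs from x0 only in the last bit
    for step in range(1, max_steps + 1):
        a = 4.0 * a * (1.0 - a)
        b = 4.0 * b * (1.0 - b)
        if abs(a - b) > tol:
            return step
    return None


step = first_divergence(0.9)
print("trajectories differ by more than 0.1 after step:", step)
```

The map roughly doubles small differences on each iteration, so a one-ulp discrepancy needs only a few dozen steps to become visible at the scale of the values themselves.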

As an aside, the LHC@Home was initially run almost like a hobby project by a few researchers connected to the LHC, without much official support. However, the data the project produced was AFAIK highly beneficial to the machine commissioning, and it later became a more official part of the High Luminosity upgrade.

[1]: https://cds.cern.ch/record/1463291/files/CERN-ATS-2012-159.p...

[2]: https://epaper.kek.jp/icap06/PAPERS/MOM1MP01.PDF

arnoldjm 1197 days ago

I looked into this issue with rsqrt (and with rcp as well) between Intel and AMD processors in connection with CERN some years ago (2016). An unpublished report can be found at [1].

TL;DR: The same (very small) executables gave different results when run on Intel and AMD processors because the rsqrt and rcp instructions produced slightly different outputs on the two systems.

[1]: https://github.com/jeff-arnold/math_routines/blob/main/rsqrt....

magicalhippo 1197 days ago

Interesting results, thanks for sharing!

hansvm 1195 days ago

Determinism in the LSB is often a prerequisite for epsilon-determinism in the result. Algorithms with many possible solutions and any step that bifurcates on a floating point value should be treated with suspicion.

Classic examples include most under-constrained randomized algorithms, like training a neural network. Rejection sampling is required to accurately produce some sorts of randomness, and that yields bifurcations in the initialization if you don't have LSB determinism. The complicated loss landscape then virtually guarantees you'll converge to a different minimum. Even with a deterministic seed, algorithms guaranteed to converge, a principled way to ensure that concurrent computations yield bitwise identical results no matter the execution order, and most of your operations being bitwise identical, a few stray LSB issues in your inverse square roots or transcendentals will still nearly ensure that the final result isn't even approximately the same.
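On the "no matter the execution order" part: floating-point addition is not associative, so a reduction that runs in a different order can differ in the last bit. A small sketch, using `math.fsum` (which returns the correctly rounded sum) as one possible order-independent reduction; that choice is an assumption for the example, not necessarily what the comment has in mind.

```python
import math
import random

# Same three numbers, two groupings, two different doubles.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
print(left == right)  # False

# A correctly rounded sum does not depend on the order of the inputs.
values = [0.1, 0.2, 0.3, 1e16, -1e16]
shuffled = values[:]
random.Random(0).shuffle(shuffled)  # fixed seed, just to get a different order
print(math.fsum(values) == math.fsum(shuffled))  # True
```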

As to why that latter thing matters, it varies, but at a minimum reproducibility makes lots of federated processes cheaper (and not just federated in the "folding at home" sense, but generally when some people are performing computations and other people are making actions based on them -- being able to explain credit scores or parole denials or whatever, or validating that several people you trust yield the same compiled binary). Bitwise reproducibility would be better, but even being approximately right would probably be good enough and isn't tenable without bitwise identical building blocks.

moralestapia 1198 days ago

It is typical to need exact bit-by-bit equality between outputs, for integrity and security purposes.

ufo 1199 days ago

What does LSB mean?

nomel 1198 days ago

For IEEE 754 [1] the last bit of the mantissa [2] is also the last bit of the binary representation. So changing it results in the smallest possible change in the number.
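A quick way to see this, assuming the usual 64-bit double: reinterpret the float as an integer, flip bit 0, and reinterpret it back. The helper name `flip_lsb` is made up for the example.

```python
import struct
import sys


def flip_lsb(x: float) -> float:
    """Flip the least significant bit of the 64-bit representation of x."""
    (bits,) = struct.unpack("<Q", struct.pack("<d", x))
    (y,) = struct.unpack("<d", struct.pack("<Q", bits ^ 1))
    return y


y = flip_lsb(1.0)
print(y)                                  # 1.0000000000000002
print(y - 1.0 == sys.float_info.epsilon)  # True: the step is one ulp of 1.0
```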

bee_rider 1199 days ago

Least significant bit