| HN Mirror



	by raincole 181 days ago
	Low-level numerical operation optimizations are often not reproducible. For example: https://www.intel.com/content/dam/develop/external/us/en/doc... (2013) But it's still surprising that the LLM doesn't work on iPhone 16 at all. After all, LLMs are known for their tolerance to quantization.


bri3d 181 days ago

Yes, "floating point accumulation doesn't commute" is a mantra everyone should have in their head, and when I first read this article, I was jumping at the bit to dismiss it out of hand for that reason.

But, what got me about this is that:

* every other Apple device delivered the same results

* Apple's own LLM silently failed on this device

to me that behavior suggests an unexpected failure rather than a fundamental issue; it seems Bad (TM) that Apple would ship devices where their own LLM didn't work.

sva_ 180 days ago

> floating point accumulation doesn't commute

It is commutative (except for NaN). It isn't associative though.
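A minimal sketch of the distinction, assuming IEEE 754 binary64 floats (which is what Python's `float` is on mainstream platforms). The values 0.1, 0.2 and 0.3 are just a convenient illustration, not anything from the article:

```python
# Swapping the operands of a single addition never changes the result
# (commutativity), but regrouping a chain of additions can (associativity).
a, b, c = 0.1, 0.2, 0.3

commutes = (a + b) == (b + a)   # True: same two operands, same rounding

left = (a + b) + c              # rounds after a + b, then again after + c
right = a + (b + c)             # rounds after b + c, then again after a +

print(commutes)                 # True
print(left, right)              # 0.6000000000000001 0.6
print(left == right)            # False
```

Each addition rounds once, so a different grouping means different intermediate values get rounded, and the final results can differ in the last bit.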

ekelsen 180 days ago

I think it commutes even when one or both inputs are NaN? The output is always NaN.

addaon 180 days ago

NaNs are distinguishable. /Which/ NaN you get doesn't commute.
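A rough sketch of what "distinguishable" means here, assuming IEEE 754 binary64 and a platform that preserves quiet-NaN bits through `struct`. The two payload values and the helper names `bits` and `from_bits` are made up for illustration. Which payload survives an addition depends on the hardware, so the sketch prints the results without claiming a particular outcome:

```python
import math
import struct

def bits(x: float) -> int:
    """Return the raw 64-bit pattern of a float as an integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]

def from_bits(n: int) -> float:
    """Build a float from a raw 64-bit pattern."""
    return struct.unpack("<d", struct.pack("<Q", n))[0]

# Two quiet NaNs that differ only in their payload bits (hypothetical values).
nan_a = from_bits(0x7FF8000000000001)
nan_b = from_bits(0x7FF8000000000002)

# Both are NaN at the level of computation...
print(math.isnan(nan_a), math.isnan(nan_b))
# ...but they are different bit patterns.
print(hex(bits(nan_a)), hex(bits(nan_b)))

# The result is NaN in either order; which payload it carries is
# platform-dependent, so compare the outputs on your own machine.
print(hex(bits(nan_a + nan_b)))
print(hex(bits(nan_b + nan_a)))
```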

ekelsen 180 days ago

I guess at the bit level, but not at the level of computation? Anything that relies on bit patterns of NaNs behaving in a certain way (like how they propagate) is in dangerous territory.

addaon 180 days ago

> Anything that relies on bit patterns of NaNs behaving in a certain way (like how they propagate) is in dangerous territory.

Why? This is well specified by IEEE 754. Many runtimes (e.g. for JavaScript) use NaN boxing. Treating floats as a semi-arbitrary selection of rational numbers plus a handful of special values is /more/ correct than treating them as real numbers, but treating them as actually specified does give more flexibility and power.

DavidVoid 180 days ago

Unless you compile with fast-math ofc, because then the compiler will assume that NaN never occurs in the program.

DavidVoid 180 days ago

I would go even further and state that "you should never assume that floating point functions will evaluate the same on two different computers, or even on two different versions of the same application", as the results of floating point evaluations can differ depending on platform, compiler optimizations, compilation flags, run-time FPU environment (rounding mode, &c.), and even memory alignment of run-time data.

There's a C++26 paper about compile time math optimizations with a good overview and discussion about some of these issues [P1383]. The paper explicitly states:

1. It is acceptable for evaluation of mathematical functions to differ between translation time and runtime.

2. It is acceptable for constant evaluation of mathematical functions to differ between platforms.

So C++ has very much accepted the fact that floating point functions should not be presumed to give identical results in all circumstances.

Now, it is of course possible to ensure that floating point-related functions give identical results on all your target machines, but it's usually not worth the hassle.

[P1383]: https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/p13...
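One way results can drift between builds is that an optimizer or vectorized kernel is free to accumulate in a different order. The sketch below only imitates that effect in plain Python, assuming IEEE 754 binary64. The function names, the two-lane split and the input values are illustrative assumptions, not how any particular compiler or the device in the article behaves:

```python
import math

def sequential_sum(values):
    """Add left to right, rounding after every step."""
    total = 0.0
    for v in values:
        total += v
    return total

def strided_sum(values, lanes=2):
    """Imitate a SIMD-style reduction: one partial sum per lane, then combine."""
    partial = [0.0] * lanes
    for i, v in enumerate(values):
        partial[i % lanes] += v
    return sequential_sum(partial)

values = [1e16, 1.0, -1e16, 1.0]

print(sequential_sum(values))  # 1.0 (the first 1.0 is absorbed by 1e16)
print(strided_sum(values))     # 2.0 (large terms cancel inside one lane)
print(math.fsum(values))       # 2.0 (correctly rounded reference)
```

The explicit loop is deliberate: the built-in `sum` may use compensated summation for floats in recent CPython versions, which would hide the effect.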

physicsguy 180 days ago

Even the exact same source code can give different results when compiled with different compilers, or with the same compiler and different compiler options.

The Intel compiler, for example, uses less than IEEE 754 precision for floating point ops by default.

danpalmer 180 days ago

FYI, the saying is "champing at the bit", it comes from horses being restrained.

jasinjames 179 days ago

Huh. I never knew "champing" was the proper spelling [0]

[0] https://www.npr.org/sections/memmos/2016/06/09/605796769/che...

mylifeandtimes 180 days ago

hey, I appreciate your love of language and sharing with us.

I'm wondering if we couldn't re-think "bit" to the computer science usage instead of the thing that goes in the horse's mouth, and what it would mean for an AI agent to "champ at the bit"?

What new sayings will we want?

nilamo 180 days ago

Byting at the bit?

odo1242 180 days ago

chomping at the bit

danpalmer 180 days ago

Actually it was originally "champing" – to grind or gnash teeth. The "chomping" (to bite) alternative cropped up more recently as people misheard and misunderstood, but it's generally accepted as an alternative now.

odo1242 179 days ago

I see

kortilla 180 days ago

It’s actually accepted as the primary now and telling people about “champing” is just seen as archaic.

danpalmer 180 days ago

Do you have a source on this, or a definition for what it means to be "primary" here? All I can find is sources confirming that "champing" is the original and more technically correct, but that "chomping" is an accepted variant.

BeetleB 180 days ago

As a sister comment said, floating point computations are commutative, but not associative.

a * b = b * a for all "normal" floating point numbers.