|
|
|
|
|
by orbital-decay
16 days ago
|
|
>I wonder why people worry so much more about "determinism" over conformance to a spec. From what I've seen in these HN discussions, most people are using "determinism" when they really mean "prompt sensitivity", i.e. minor variations in framing leading to different results. This, in turn, confuses people who do understand what determinism is supposed to mean and where it's necessary (build reproducibility for example). >nonconformance to a spec This is bound to happen for informal specs. It's the inherent property of the domain both models and humans operate in. Current models are optimized for coding rather than human communication, and have shallow understanding of human intent. Their reading between the lines is pretty poor, and they can't distinguish between important and unimportant details of the prompt, following it too literally and forcing you to give them more context for clarification. |
|
For lack of a better word, I'd also have used "determinism". But to borrow a bit from TFA, what I'd really mean by that would some kind of "semantic determinism": for any input source code in a well-defined language, a correctly working compiler will always produce output that's semantically correct for the input.
Let's say a compiler implementation internally does something random or nondeterministic but that the nondeterminism does not affect the semantics of the output. You could argue that the compiler is technically nondeterministic, but in terms of program semantics it would still be deterministic.
I assume that's what people mean when they say compilers are deterministic in comparison to LLMs.
So in some sense the post is correct, but I think the author is somewhat pedantically misinterpreting the way people use the word "nondeterminism".
IMO prompt sensitivity is something different. A prompt does not unambiguously describe full program semantics in the first place, and the neural network would not contain an explicit mechanism for producing semantically matching output even if it did. Prompt sensitivity comes on top of that but isn't the core matter.