by compumetrika 526 days ago
LLMs use pseudo-random numbers. You can set the seed and get exactly the same output with the same model and input.
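To illustrate the seeding claim in isolation, here is a minimal sketch using only Python's standard library. The toy vocabulary, the weights, and the `sample_tokens` helper are made up for illustration and stand in for a model's next-token distribution; a real LLM involves far more than this.

```python
import random

# Hypothetical toy "vocabulary" and fixed next-token probabilities.
VOCAB = ["the", "cat", "sat", "on", "mat"]
WEIGHTS = [0.4, 0.2, 0.2, 0.1, 0.1]

def sample_tokens(seed, n=8):
    """Draw n tokens from the toy distribution using a seeded PRNG."""
    rng = random.Random(seed)  # private generator, independent of global state
    return rng.choices(VOCAB, weights=WEIGHTS, k=n)

# Same seed and same distribution give the same sequence on every call.
first = sample_tokens(seed=42)
second = sample_tokens(seed=42)
print(first == second)
```

This only shows that the sampling step is reproducible when the probabilities themselves are bit-for-bit identical.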

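You won't, because floating-point arithmetic isn't associative: the result of a sum depends on the order in which the terms are added,

and the GPU scheduler isn't deterministic, so that order can differ from run to run.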
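A small CPU-only sketch of the non-associativity point: the same numbers added in a different order give a different floating-point result. The `sum_in_order` helper is hypothetical and simply stands in for a reduction whose order is not fixed, as can happen in a parallel GPU kernel.

```python
def sum_in_order(values):
    """Left-to-right floating-point sum, in exactly the order given."""
    total = 0.0
    for v in values:
        total += v
    return total

# Regrouping the same three terms changes the rounded result.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
print(left == right)  # False

# Reordering matters too: the 1.0 is lost when it is added to 1e16 first,
# because 1e16 + 1.0 rounds back to 1e16 in double precision.
print(sum_in_order([1e16, 1.0, -1e16]))  # 0.0
print(sum_in_order([1e16, -1e16, 1.0]))  # 1.0
```

In a model, differences of this kind start in the last bits of the logits, but they can flip which token is chosen once two candidates are close.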

You can set PyTorch to deterministic mode with a small performance penalty: https://pytorch.org/docs/stable/notes/randomness.html#avoidi...
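For reference, a rough sketch of what turning on that mode can look like. The `enable_deterministic_pytorch` wrapper is a hypothetical helper, not part of PyTorch; the calls inside it are the settings described in the PyTorch reproducibility notes, and which of them you actually need depends on your PyTorch and CUDA versions.

```python
import os

def enable_deterministic_pytorch(seed=0):
    """Hypothetical helper: seed PyTorch and request deterministic kernels."""
    # Some cuBLAS operations need this workspace setting to be deterministic;
    # it should be in place before any CUDA work starts.
    os.environ.setdefault("CUBLAS_WORKSPACE_CONFIG", ":4096:8")

    # Imported here so this sketch can be loaded without PyTorch installed.
    import torch

    torch.manual_seed(seed)                   # seeds the PyTorch RNGs
    torch.use_deterministic_algorithms(True)  # raise on known nondeterministic ops
    torch.backends.cudnn.benchmark = False    # avoid run-dependent algorithm choice
```

Call it once at program start, before building the model. Operations that have no deterministic implementation will then raise an error instead of silently varying between runs.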

Unfortunately, this is only deterministic on the same hardware, but there is no reason why one couldn't write reasonably efficient LLM kernels that give the same results across hardware. It just has not been a priority.

Nevertheless, I still agree with the main point that it is difficult to get LLMs to produce the same output reliably. A small change in the context might trigger all kinds of changes in the generated code.