Hacker News new | ask | show | jobs
by blibble 524 days ago
you won't because floating point arithmetic isn't associative

and the GPU scheduler isn't deterministic

1 comments

You can set PyTorch to deterministic mode with a small performance penalty: https://pytorch.org/docs/stable/notes/randomness.html#avoidi...

Unfortunately, this is only deterministic on the same hardware, but there is no reason why one couldn't write reasonably efficient LLM kernels. It just has not been a priority.

Nevertheless, I still agree with the main point that it is difficult to get LLMs to produce the same output reliably. A small change in the context might trigger all kinds of changes in the generated code.