
You can set PyTorch to deterministic mode with a small performance penalty: https://pytorch.org/docs/stable/notes/randomness.html#avoidi...
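For anyone who wants to try it, here is a minimal sketch of what turning on deterministic mode can look like. The helper names `enable_determinism` and `sample` are made up for illustration, and the `CUBLAS_WORKSPACE_CONFIG` value is one of the settings the PyTorch notes suggest for CUDA; it has no effect on CPU-only runs.

```python
import os

# cuBLAS needs this for deterministic results on CUDA; set it before any CUDA work.
os.environ.setdefault("CUBLAS_WORKSPACE_CONFIG", ":4096:8")

import torch


def enable_determinism(seed: int = 0) -> None:
    """Hypothetical helper: seed the RNG and request deterministic algorithms."""
    torch.manual_seed(seed)
    # Raises an error at call time if an op has no deterministic implementation.
    torch.use_deterministic_algorithms(True)
    # Autotuning may pick different convolution algorithms from run to run.
    torch.backends.cudnn.benchmark = False


def sample(seed: int) -> torch.Tensor:
    """Hypothetical helper: run a tiny seeded computation."""
    enable_determinism(seed)
    x = torch.randn(4, 8)
    w = torch.randn(8, 8)
    return x @ w


first = sample(0)
second = sample(0)
print(torch.equal(first, second))
```

This only checks repeatability on the machine it runs on, which is all that deterministic mode promises.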

Unfortunately, this is only deterministic on the same hardware. Still, there is no reason why one couldn't write reasonably efficient LLM kernels that give identical results across hardware. It just has not been a priority.
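One common reason results differ between devices (my assumption, not something the PyTorch notes spell out in these terms) is that floating-point addition is not associative, so a kernel that sums in a different order can round differently. A plain-Python sketch of the effect, with `math.fsum` standing in for an order-independent reduction:

```python
import math
from functools import reduce

values = [0.1, 0.2, 0.3]

# Same numbers, two accumulation orders, as two kernels might use.
left_to_right = reduce(lambda acc, v: acc + v, values)
right_to_left = reduce(lambda acc, v: acc + v, reversed(values))

# The two orders round differently, so the results are not bit-identical.
print(left_to_right == right_to_left)

# math.fsum returns the correctly rounded sum, so the order does not matter.
print(math.fsum(values) == math.fsum(reversed(values)))
```

A real kernel would more likely fix the reduction order than use exact summation, but the point is the same: order-independent results are achievable if someone decides to pay for them.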

Nevertheless, I still agree with the main point that it is difficult to get LLMs to produce the same output reliably. A small change in the context might trigger all kinds of changes in the generated code.