by stratos123 26 days ago
"Defeating Nondeterminism in LLM Inference" ( https://thinkingmachines.ai/blog/defeating-nondeterminism-in...) has a repo: https://github.com/thinking-machines-lab/batch_invariant_ops

which seems to have eventually been merged into vllm: https://docs.vllm.ai/en/latest/features/batch_invariance/

So you can get determinism locally. On a cursory search I wasn't able to find any LLM provider advertising determinism; if you need it for research you might have to rent a dedicated GPU pod and run vllm there with the appropriate settings.
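
If you do set that up, it's probably worth checking that you actually get identical outputs before relying on it. A minimal sketch, assuming you wrap your own client in a `generate(prompt) -> str` callable (that name and signature are hypothetical, not part of vllm):

```python
from collections import Counter
from typing import Callable


def distinct_outputs(generate: Callable[[str], str], prompt: str, runs: int = 5) -> Counter:
    """Call generate(prompt) `runs` times and tally the distinct completions."""
    return Counter(generate(prompt) for _ in range(runs))


def is_deterministic(generate: Callable[[str], str], prompt: str, runs: int = 5) -> bool:
    """True if every run returned exactly the same text."""
    return len(distinct_outputs(generate, prompt, runs)) == 1
```

Note that this only repeats the same request sequentially. The blog post attributes the nondeterminism to batch size changing with server load, so a more convincing check would send the same prompt while other requests are in flight.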

1 comment

thanks