Hacker News
by deadbabe 521 days ago
They are deterministic at 0 temperature
5 comments

At zero temp there is still non-determinism due to sampling and the fact that floating-point addition is not associative, so you will get varying results depending on how the parallel work happens to be ordered.
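To make the floating-point point concrete, here is a minimal sketch in plain Python (no LLM involved, the values are just illustrative). IEEE 754 addition is commutative, but it is not associative, so the grouping of the same three numbers changes the result:

```python
# Same three numbers, two different groupings.
a, b, c = 0.1, 0.2, 0.3

left = (a + b) + c    # 0.6000000000000001
right = a + (b + c)   # 0.6

# Swapping the operands of a single addition does not change anything.
commutes = (a + b) == (b + a)

print(left, right, left == right)  # 0.6000000000000001 0.6 False
print(commutes)                    # True
```

A parallel reduction effectively picks a grouping at run time, which is where the variation comes from.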
(Disclaimer: I know literally nothing about LLMs.) Wouldn't there still be issues of sensitivity, though? Like, wouldn't you still have to ensure that the wording of your commands stays exactly the same every time? And with models that take less discrete data (e.g. ChatGPT's new "advanced voice model" that works on audio directly), this seems even harder.
s/advanced voice model/advanced voice mode/ (too late for me to edit my original comment)
They are pretty deterministic then, but they are also pretty useless at 0 temperature.
Not for the leading LLMs from OpenAI and Anthropic.
Not really, not in practice. The order of execution is non-deterministic when running on a cluster, a GPU, or more than one CPU core, and rounding errors propagate differently on each run.
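A toy sketch of how execution order can leak into the output even with greedy (temperature 0) decoding. Everything here is hypothetical and contrived: `reduce_in_order` stands in for a parallel reduction that happens to finish in a different order on each run, the terms are picked to exaggerate rounding, and `greedy_pick` is just an argmax. Real kernels differ by far smaller amounts, but the mechanism is the same when two logits are nearly tied:

```python
def reduce_in_order(terms, order):
    """Sum `terms` in the given index order (stand-in for a parallel reduction)."""
    total = 0.0
    for i in order:
        total += terms[i]
    return total


def greedy_pick(logits):
    """Temperature-0 decoding: return the index of the largest logit."""
    return max(range(len(logits)), key=lambda i: logits[i])


# Contrived partial sums for one token's logit; mathematically they add up to 2.
terms = [1e16, 1.0, -1e16, 1.0]

run_1 = reduce_in_order(terms, [0, 1, 2, 3])  # 1.0 (the first 1.0 is rounded away)
run_2 = reduce_in_order(terms, [0, 2, 1, 3])  # 2.0 (big terms cancel first)

rival_logit = 1.5  # hypothetical logit of a competing token

pick_1 = greedy_pick([run_1, rival_logit])  # 1: the rival token wins
pick_2 = greedy_pick([run_2, rival_logit])  # 0: the first token wins

print(run_1, run_2, pick_1, pick_2)  # 1.0 2.0 1 0
```

Once one token differs, everything generated after it is conditioned on a different prefix, so the outputs can diverge completely.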