| HN Mirror



	by gloxkiqcza 354 days ago
	Correct me if I’m wrong but LLMs are deterministic, the randomness is added intentionally in the pipeline.

2 comments

mzl 354 days ago

LLMs can be run in a mostly deterministic mode (see https://docs.pytorch.org/docs/stable/notes/randomness.html for some info on running PyTorch programs reproducibly).

Varying the deployment type (chip model, number of chips, batch size, ...) can also change the output due to rounding errors. See https://arxiv.org/abs/2506.09501 for some details on that.


zekica 354 days ago

The two parts of your statement don't go together. The list of potential output tokens and their probabilities is generated deterministically, but the token actually returned is then chosen at random (weighted by each token's probability, as adjusted by the "temperature" parameter).
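To illustrate the split described above, here is a minimal sketch in plain Python, not any particular library's sampler. The logits are made-up numbers and the function names `softmax_with_temperature` and `sample_token` are hypothetical. The probability step is deterministic; only the draw is random.

```python
import math
import random

def softmax_with_temperature(logits, temperature):
    """Deterministic step: turn raw scores into probabilities."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_token(logits, temperature, rng):
    """Random step: draw one token index weighted by those probabilities."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

logits = [2.0, 1.9, 0.5]   # made-up scores for three candidate tokens
probs = softmax_with_temperature(logits, temperature=1.0)
sharp = softmax_with_temperature(logits, temperature=0.1)

rng = random.Random()      # unseeded, so the draws can differ between runs
draws = [sample_token(logits, 1.0, rng) for _ in range(10)]
```

Lowering the temperature concentrates probability on the top token (`sharp[0]` is larger than `probs[0]`), which is why low temperatures look "more deterministic" even though a random draw still happens.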


galaxyLogic 354 days ago

I assume they use software-based pseudo-random number generators. Those can typically be given a seed value, which fully determines the sequence of random numbers that will be generated.

So if an LLM uses a seedable pseudo-random number generator for its random numbers, then its sampling can be fully deterministic.
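A small sketch of that idea, using Python's standard `random` module as a stand-in for whatever generator a real inference stack uses (an assumption on my part). The weights and the helper name `sample_sequence` are hypothetical.

```python
import random

def sample_sequence(seed, weights, length):
    """Draw `length` token indices from a generator seeded with `seed`."""
    rng = random.Random(seed)  # private generator, independent of global state
    return rng.choices(range(len(weights)), weights=weights, k=length)

weights = [0.5, 0.3, 0.2]      # made-up token probabilities
run_a = sample_sequence(1234, weights, 20)
run_b = sample_sequence(1234, weights, 20)

# Same seed and same weights give the same sequence of "random" picks.
print(run_a == run_b)
```

This only covers the sampling step: it assumes the weights fed to the generator are themselves bit-for-bit identical between runs.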


lou1306 354 days ago

There are subtle sources of nondeterminism in concurrent floating-point operations, especially on GPUs. So even with a fixed seed, if an LLM encounters two tokens with very close likelihoods, it may pick one or the other across different runs. This has been observed even with temperature=0, which in principle does not involve _any_ randomness (see the arXiv paper cited earlier in this thread).
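The underlying effect can be reproduced on a CPU: floating-point addition is not associative, so the order in which partial sums are combined changes the result in the last bits. The sketch below is a toy with made-up numbers, not GPU code; `pick` and the `rival` score are hypothetical stand-ins for a near-tie between two tokens.

```python
terms = [0.1, 0.2, 0.3]

# Same three numbers, two different reduction orders.
forward = (terms[0] + terms[1]) + terms[2]   # 0.6000000000000001
backward = terms[0] + (terms[1] + terms[2])  # 0.6

rival = 0.6  # made-up score of a competing token

def pick(score_a, score_b):
    """Greedy choice between token A and token B, no randomness involved."""
    return "A" if score_a > score_b else "B"

print(forward == backward)    # False
print(pick(forward, rival))   # A
print(pick(backward, rival))  # B
```

On a GPU the reduction order can vary with kernel choice, batch size, or thread scheduling, so a near-tie like this one can resolve differently from run to run.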


mzl 354 days ago

That depends on the sampling strategy. Greedy sampling takes the highest-probability token at each step.
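For comparison, a minimal sketch of greedy selection (made-up logits, hypothetical helper name `greedy_pick`). No random number generator is involved, so repeated calls on identical logits always agree.

```python
def greedy_pick(logits):
    """Return the index of the largest score; ties go to the lowest index."""
    return max(range(len(logits)), key=lambda i: logits[i])

logits = [2.0, 1.9, 0.5]  # made-up scores for three candidate tokens
picks = [greedy_pick(logits) for _ in range(5)]
print(picks)  # [0, 0, 0, 0, 0]
```

The result is only as stable as the logits themselves: if they differ in the last bits between runs, a near-tie can still flip.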
