by HarHarVeryFunny 810 days ago

Not quite... temperature and token selection are two different things.

At the output of an LLM, the raw next-token prediction values (logits) are passed through a softmax to convert them into probabilities. These probabilities then drive token selection according to the chosen scheme: greedy selection (always choose the highest-probability token), or a sampling scheme such as top-k or top-p. Under top-k sampling, a token is drawn at random from the k most probable tokens; under top-p sampling, it is drawn from the smallest set of top tokens whose cumulative probability reaches p.
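To make the pipeline above concrete, here is a rough sketch in plain Python. The function names (softmax, greedy, top_k_sample, top_p_sample) and the toy logits are made up for illustration, and weighting the draw by the renormalized probabilities of the kept tokens is an assumption about how a typical implementation behaves, not a description of any particular library.

```python
import math
import random

def softmax(logits):
    # Subtract the max logit for numerical stability; the result is unchanged.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def greedy(probs):
    # Always pick the index of the highest-probability token.
    return max(range(len(probs)), key=lambda i: probs[i])

def top_k_sample(probs, k, rng=random):
    # Keep the k most probable tokens, then draw one of them,
    # weighted by probability (assumed convention).
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept = order[:k]
    return rng.choices(kept, weights=[probs[i] for i in kept], k=1)[0]

def top_p_sample(probs, p, rng=random):
    # Keep the smallest prefix of tokens (by descending probability)
    # whose cumulative probability reaches p, then draw from that set.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    return rng.choices(kept, weights=[probs[i] for i in kept], k=1)[0]

# Hypothetical logits for a four-token vocabulary.
logits = [2.0, 1.0, 0.1, -1.0]
probs = softmax(logits)
print(greedy(probs), top_k_sample(probs, k=2), top_p_sample(probs, p=0.9))
```

Greedy returns the same index every time, while the two sampling functions can return different indices from run to run.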

The softmax temperature setting, which scales the logits before the softmax is applied, preserves the relative order of the output probabilities. Higher temperatures boost outputs that would otherwise have had low probability, so the output probabilities become more balanced; lower temperatures do the opposite. The effect on token selection depends on the selection scheme being used.
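A small sketch of the temperature effect, assuming the usual convention of dividing each logit by the temperature before the softmax. The names softmax_with_temperature and ranking, and the example logits, are illustrative only.

```python
import math

def softmax_with_temperature(logits, temperature):
    # Assumed convention: divide logits by the temperature, then softmax.
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def ranking(probs):
    # Token indices ordered from most to least probable.
    return sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

# Hypothetical logits for a three-token vocabulary.
logits = [2.0, 1.0, 0.1]
for t in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, ranking(probs), [round(p, 3) for p in probs])
```

The ranking printed is the same at every temperature; only the spread changes, with the top token losing probability mass to the others as the temperature rises.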

If greedy selection is used, temperature has no effect: it preserves the relative order of probabilities, so the highest-probability token is always chosen.

If a sampling scheme (top-k or top-p) is used, a higher temperature increases the likelihood of sampling an otherwise lower-probability token. Note, however, that even at a low temperature setting sampling is still probabilistic, so there is no guarantee (or desire!) that the highest-probability token will be selected.
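As a rough illustration of that last point, the sketch below counts how often top-k sampling picks something other than the most probable token at a low and a high temperature. Everything here is hypothetical: the helper names, the logits, k=2, the seed, and the divide-by-temperature convention are assumptions chosen for the example.

```python
import math
import random

def softmax_with_temperature(logits, temperature):
    # Assumed convention: divide logits by the temperature, then softmax.
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_sample(probs, k, rng):
    # Draw one of the k most probable tokens, weighted by probability.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept = order[:k]
    return rng.choices(kept, weights=[probs[i] for i in kept], k=1)[0]

def non_top_rate(logits, temperature, k=2, draws=10_000, seed=0):
    # Fraction of draws that did NOT select the most probable token.
    rng = random.Random(seed)
    probs = softmax_with_temperature(logits, temperature)
    best = max(range(len(probs)), key=lambda i: probs[i])
    misses = sum(top_k_sample(probs, k, rng) != best for _ in range(draws))
    return misses / draws

logits = [2.0, 1.0, 0.1]
print("T=0.5:", non_top_rate(logits, 0.5))
print("T=2.0:", non_top_rate(logits, 2.0))
```

The rate is noticeably higher at the higher temperature, but it is above zero at the lower one too, so the top token is not guaranteed in either case.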