by bigyabai 437 days ago
Generative AI is typically deterministic; most inference software includes a random seed to yield different results on each repeated entry.
1 comments

That's not strictly correct. All LLMs output logits that are softmax'd into a probability distribution over the next token, and this distribution is indeed deterministic.

Most generative AI apps set a nonzero temperature, which scales the logits before the softmax. So if you have a distribution with 50%, 30%, 20% for tokens, and a temperature of 1 (which leaves the distribution unchanged), then you'd get up to 3 different outputs sampled at those exact probabilities, which iteratively cascade into completely different texts. The RNG behind the sampling can be controlled by a seed, but with distributed systems that is often not the case: I've only seen seeds returned for cases where the entire model is on a single system. Otherwise, just not setting a seed is fine for sufficient randomness.
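
To make the 50/30/20 example concrete, here's a rough sketch in plain Python. The logits, the `softmax` and `sample_token` helpers, and the seed value 42 are all made up for illustration; real inference stacks do this on tensors and may not expose a seed at all.

```python
import math
import random

def softmax(logits, temperature=1.0):
    # Divide the logits by the temperature, then normalize.
    # Subtracting the max keeps exp() numerically stable.
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_token(logits, temperature, rng):
    # Draw one token index according to the tempered distribution.
    probs = softmax(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

# Hypothetical logits chosen so that temperature 1 gives 50% / 30% / 20%.
logits = [math.log(0.5), math.log(0.3), math.log(0.2)]
probs = softmax(logits, 1.0)       # deterministic: same logits, same distribution
sharper = softmax(logits, 0.5)     # lower temperature favors the top token more

# Two generators with the same (arbitrary) seed produce the same samples.
rng_a = random.Random(42)
rng_b = random.Random(42)
run_a = [sample_token(logits, 1.0, rng_a) for _ in range(10)]
run_b = [sample_token(logits, 1.0, rng_b) for _ in range(10)]
```

The distribution itself never changes between calls; only the draw from it does, and fixing the seed fixes the draws.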

If the temperature is 0, then it instead chooses the token with the highest probability, and done iteratively, the final output will be the same every time. (This is not accounting for distributed system weirdness.)
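
A minimal sketch of that temperature-0 case, assuming it is implemented as a plain argmax (dividing logits by 0 isn't meaningful, so it has to be a special case). `toy_logits` is a hypothetical stand-in for a real model: just a fixed function of the context.

```python
def toy_logits(context):
    # Hypothetical "model": a fixed, deterministic function of the last token.
    last = context[-1]
    return [((last + 1) * (i + 2)) % 7 for i in range(4)]

def greedy_decode(prompt, steps):
    tokens = list(prompt)
    for _ in range(steps):
        logits = toy_logits(tokens)
        # Softmax preserves ordering, so the argmax of the logits is also
        # the highest-probability token. No RNG is involved.
        tokens.append(max(range(len(logits)), key=logits.__getitem__))
    return tokens

first = greedy_decode([0], 8)
second = greedy_decode([0], 8)  # same prompt, same tokens
```

Since nothing random happens at any step, repeated runs from the same prompt give identical sequences, at least on a single machine.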

Ah OK, got it: the next token is sampled from a deterministic probability distribution, hence the random output. But why not just take the token with the highest probability/weight? Is this to avoid some local minima?
It depends on your use case. Deterministic output is less "creative."