by michaeljx 437 days ago
Ah OK, got it: the next token is sampled from a probability distribution that is itself computed deterministically, hence the random output. But why not just take the token with the highest probability/weight? Is this to avoid some kind of local minimum?
1 comment

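It depends on your use case. Always taking the highest-probability token (greedy decoding) makes the output deterministic, but also less "creative": the same prompt gives you the same completion every time.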
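To make the difference concrete, here is a rough sketch of the two strategies for a single decoding step. The vocabulary, the logit values, and the function names (`softmax`, `greedy_pick`, `sample_pick`) are made up for illustration and don't come from any particular library or model.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; lower temperature sharpens the peak."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def greedy_pick(logits):
    """Deterministic: always return the index of the highest score."""
    return max(range(len(logits)), key=lambda i: logits[i])

def sample_pick(logits, temperature=1.0, rng=random):
    """Random: draw an index in proportion to its softmax probability."""
    probs = softmax(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]

# Hypothetical toy vocabulary and scores for one step.
vocab = ["the", "a", "cat", "dog"]
logits = [2.0, 1.5, 0.5, 0.1]

if __name__ == "__main__":
    print("greedy: ", [vocab[greedy_pick(logits)] for _ in range(5)])
    print("sampled:", [vocab[sample_pick(logits)] for _ in range(5)])
```

Greedy picks the same token on every call, while sampling mixes in the lower-ranked tokens now and then. Pushing `temperature` toward 0 makes the sampler behave more and more like greedy.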