Hacker News
by
michaeljx
437 days ago
Ah OK, got it: the model's probability distribution is deterministic, but the next token is sampled from it, hence the random output. But why not just take the token with the highest probability/weight? Is this to avoid some local minimum?
1 comment
minimaxir
436 days ago
It depends on your use case. Deterministic output is less "creative."
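To make the difference concrete, here's a rough sketch of greedy picking vs. temperature sampling. The vocabulary, logits, and function names (`softmax`, `greedy_pick`, `sample_pick`) are all made up for illustration; this is not how any particular model or library implements it.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def greedy_pick(logits):
    """Deterministic: always return the index of the highest-scoring token."""
    return max(range(len(logits)), key=lambda i: logits[i])

def sample_pick(logits, temperature=1.0, rng=random):
    """Stochastic: draw an index in proportion to its softmax probability."""
    probs = softmax(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]

# Hypothetical toy vocabulary and scores, purely for illustration.
vocab = ["the", "a", "cat", "dog"]
logits = [2.0, 1.5, 0.5, 0.1]

if __name__ == "__main__":
    print("greedy: ", vocab[greedy_pick(logits)])
    print("sampled:", [vocab[sample_pick(logits)] for _ in range(8)])
```

Greedy gives the same token every time for the same input. Sampling will sometimes pick the lower-probability tokens, and as the temperature goes toward 0 it behaves more and more like greedy.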