Hacker News
reqo 854 days ago
Isn’t that what the softmax layer is doing? The token with the highest probability among all the tokens in the model's vocabulary is chosen as the next token!
danielmarkbruce 854 days ago
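No. The softmax layer only produces a probability distribution over the vocabulary. What you do with that distribution is up to you. There are numerous ways to choose a token from it: taking the most probable token (greedy decoding) is just one of them, and sampling from the distribution is another.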
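To make the distinction concrete, here is a minimal sketch in plain Python. Nothing in it comes from a real model: the four-word vocabulary, the logits, and the helper names (`softmax`, `greedy`, `sample`, `top_k_sample`) are all made up for illustration. It shows that softmax only turns scores into probabilities, and that picking a token is a separate step with several reasonable strategies.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities. This does NOT pick a token."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def greedy(probs):
    """Strategy 1: always take the single most probable token (argmax)."""
    return max(range(len(probs)), key=probs.__getitem__)

def sample(probs, rng):
    """Strategy 2: draw a token at random, weighted by its probability."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

def top_k_sample(probs, k, rng):
    """Strategy 3: keep only the k most probable tokens, then sample among them."""
    top = sorted(range(len(probs)), key=probs.__getitem__, reverse=True)[:k]
    weights = [probs[i] for i in top]  # choices() rescales the weights itself
    return rng.choices(top, weights=weights, k=1)[0]

# Hypothetical toy vocabulary and logits, for illustration only.
vocab = ["cat", "dog", "fish", "bird"]
logits = [2.0, 1.5, 0.3, -1.0]

probs = softmax(logits)
rng = random.Random(0)  # seeded so repeated runs draw the same tokens

print("greedy:", vocab[greedy(probs)])
print("sampled:", [vocab[sample(probs, rng)] for _ in range(5)])
print("top-2 sampled:", [vocab[top_k_sample(probs, 2, rng)] for _ in range(5)])
```

Greedy decoding returns the same token every time for a given distribution, while the two sampling strategies can return different tokens on each call. The `temperature` argument is one more knob: values below 1 sharpen the distribution and values above 1 flatten it, before any selection happens.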