by fancyfredbot
529 days ago
The author is saying that the output token is not deterministic. I don't think they said the distribution was stochastic. Even so, the distribution of the second token output by the model would be stochastic (unless you condition on the first token), because it depends on which first token happened to be sampled. So in that sense there may also be a stochastic probability distribution.

You could still easily model the next token as a conditional probability distribution if you wanted, though; computing the entropy just might be a bit more expensive.
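
To make that concrete, here is a minimal sketch with a made-up two-token vocabulary. The probabilities and the names `p_first` and `p_second_given_first` are hypothetical and not taken from any real model. It shows that the second token's distribution changes with the sampled first token, and that the entropy can still be computed through the chain rule H(X1, X2) = H(X1) + H(X2 | X1), at the cost of one entropy computation per possible prefix.

```python
import math

# Hypothetical toy "model": all numbers below are invented for illustration.
p_first = {"a": 0.5, "b": 0.5}
p_second_given_first = {
    "a": {"a": 0.9, "b": 0.1},  # second-token distribution if the first token was "a"
    "b": {"a": 0.2, "b": 0.8},  # second-token distribution if the first token was "b"
}


def entropy(dist):
    """Shannon entropy in bits of a dict mapping outcome -> probability."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


# One entropy per prefix: this is the extra work conditioning costs.
per_prefix = {t: entropy(d) for t, d in p_second_given_first.items()}

# Conditional entropy H(X2 | X1): per-prefix entropies weighted by the prefix probability.
h_cond = sum(p_first[t] * per_prefix[t] for t in p_first)

# Marginal distribution of the second token, with the first token summed out.
vocab = sorted(p_first)
p_second = {
    v: sum(p_first[t] * p_second_given_first[t][v] for t in p_first)
    for v in vocab
}
h_marginal = entropy(p_second)

# Chain rule for the entropy of the two-token sequence.
h_joint = entropy(p_first) + h_cond

print("H(X2 | X1 = t):", per_prefix)
print("H(X2 | X1)    :", h_cond)
print("H(X2)         :", h_marginal)
print("H(X1, X2)     :", h_joint)
```

In this toy setup the conditional entropy comes out no larger than the marginal entropy of the second token, which is the general rule: conditioning never increases entropy.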