by antonvs
33 days ago
> “Policy” here refers to a probability distribution, i.e. a function that, given some context, assigns probabilities to possible next tokens.

This should say "...refers to a function that produces a probability distribution." The latter half of the quoted sentence describes it correctly.
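
To make the distinction concrete, here is a minimal Python sketch. The vocabulary, the scoring rule, and the name `policy` are all made up for illustration and are not taken from the article. The point is only that the policy is the function, while the distribution is what it returns for one particular context.

```python
import math

# Hypothetical toy vocabulary, chosen only for illustration.
VOCAB = ["the", "cat", "sat"]

def policy(context):
    """Map a context (a list of tokens) to a probability distribution over VOCAB."""
    # Made-up scoring rule: give a lower score to repeating the last token.
    logits = [0.0 if context and tok == context[-1] else 1.0 for tok in VOCAB]
    # Softmax turns the scores into probabilities that sum to 1.
    z = sum(math.exp(x) for x in logits)
    return {tok: math.exp(x) / z for tok, x in zip(VOCAB, logits)}

# The policy is the function; each call yields one distribution.
dist = policy(["the", "cat"])
other = policy(["the"])
```

Calling `policy` with a different context gives a different distribution, which is why describing the policy itself as "a probability distribution" is imprecise.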