|
|
|
|
|
by sebzim4500
1157 days ago
|
|
Sorry if I was unclear. I know that the model is incentivised to accurately predict the probability distribution of the next token. I mean that the model is not being incentivised to literally produce the output tokens corresponding to "I don't know" when asked a question where it is uncertain. |
|
What I wanted to emphasize is that the training _does_ actually incentivize the model to say "I don't know" but on a lower level.