by sebzim4500 1157 days ago
Sorry if I was unclear. I know that the model is incentivised to accurately predict the probability distribution of the next token. I mean that the model is not being incentivised to literally produce the output tokens corresponding to "I don't know" when asked a question where it is uncertain.
1 comment

Yes, exactly.

What I wanted to emphasize is that the training _does_ actually incentivize the model to say "I don't know", just at a lower level: in the probabilities it assigns to the next token, not in the literal output text.

If only the OpenAI API still gave us the token probabilities like it used to.
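
To make the "lower level" point concrete, here is a minimal sketch of how uncertainty could be read off token probabilities if we had them. The two distributions are made up for illustration, not real model output, and `entropy_bits` is a hypothetical helper name. Shannon entropy is just one reasonable way to measure the spread.

```python
import math


def entropy_bits(probs):
    """Shannon entropy, in bits, of a next-token probability distribution."""
    return -sum(p * math.log2(p) for p in probs.values() if p > 0)


# Hypothetical next-token distributions (invented numbers, not model output).
# Nearly all the mass on one token: the model "knows".
confident = {"Paris": 0.97, "Lyon": 0.02, "Nice": 0.01}
# Mass spread evenly over four candidates: the low-level "I don't know".
uncertain = {"1912": 0.25, "1913": 0.25, "1914": 0.25, "1915": 0.25}

print("confident:", round(entropy_bits(confident), 2), "bits")
print("uncertain:", round(entropy_bits(uncertain), 2), "bits")
```

The flat distribution has much higher entropy than the peaked one, even though sampling from either still yields a confident-looking answer in the text.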