What I wanted to emphasize is that the training _does_ actually incentivize the model to say "I don't know", just at a lower level.