by mmoskal
799 days ago
You can ask the model something like: "Is xyz correct? Answer with one word, either Yes or No." The log probs of the two tokens should then indicate how certain it is. Apparently, though, RLHF-tuned models are worse at this than base models.
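To make that concrete, here is a rough sketch of turning the Yes/No log probs into a single confidence number. It assumes you already have the top log probs for the first answer token as a plain token-to-logprob dict (how you get that depends on your API or inference library, so that part is left out). The function name `yes_probability` and the example values are made up for illustration.

```python
import math


def yes_probability(top_logprobs: dict[str, float]) -> float:
    """Return P(Yes) renormalized over the Yes/No tokens only.

    `top_logprobs` maps candidate first tokens to their log probabilities.
    Case and whitespace variants (e.g. " yes", "No") are merged, since
    tokenizers often produce several spellings of the same answer.
    """
    yes_mass = 0.0
    no_mass = 0.0
    for token, logprob in top_logprobs.items():
        label = token.strip().lower()
        if label == "yes":
            yes_mass += math.exp(logprob)
        elif label == "no":
            no_mass += math.exp(logprob)
    total = yes_mass + no_mass
    if total == 0.0:
        raise ValueError("neither Yes nor No appears in the returned log probs")
    # Ignore probability mass on any other token and renormalize.
    return yes_mass / total


# Hypothetical log probs for the first answer token (not real model output).
example = {
    "Yes": math.log(0.6),
    "No": math.log(0.2),
    " yes": math.log(0.1),
    "Maybe": math.log(0.1),
}
p_yes = yes_probability(example)
print(f"P(Yes | Yes or No) = {p_yes:.3f}")
```

Renormalizing over just the two answers is a design choice: it throws away whatever mass the model put on other tokens, so if that leftover mass is large the number is probably not worth trusting.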