by sgk284
1119 days ago
OpenAI touches on this briefly on page 12 of the GPT-4 technical report (https://cdn.openai.com/papers/gpt-4.pdf). Before the model is aligned toward safer outputs, its confidence in an answer is highly correlated with the actual accuracy of that answer. After alignment, though, that calibration degrades badly, and the model's confidence becomes a much weaker signal of whether the answer is actually correct.
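
For anyone who wants to see what "confidence correlated with accuracy" means in practice, here is a rough sketch of one common way to measure it, expected calibration error (ECE). This is not the report's own evaluation code; the function name, the bin count, and both toy datasets are made up purely for illustration.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average gap between mean confidence and accuracy, per bin."""
    # Group predictions into equal-width confidence bins over [0, 1].
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the last bin
        bins[idx].append((conf, ok))

    total = len(confidences)
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        avg_conf = sum(c for c, _ in bucket) / len(bucket)
        accuracy = sum(1 for _, ok in bucket if ok) / len(bucket)
        # Each bin contributes its |confidence - accuracy| gap, weighted by its size.
        ece += (len(bucket) / total) * abs(avg_conf - accuracy)
    return ece


# Hypothetical toy data (NOT from the GPT-4 report): 20 answers, stated confidence
# 0.9 for the first ten and 0.6 for the last ten.
confidences = [0.9] * 10 + [0.6] * 10

# "Calibrated" case: 9 of the 0.9-confidence answers and 6 of the 0.6 ones are right.
calibrated_correct = [True] * 9 + [False] + [True] * 6 + [False] * 4

# "Miscalibrated" case: same stated confidences, but only half of each group is right.
miscalibrated_correct = [True] * 5 + [False] * 5 + [True] * 5 + [False] * 5

ece_calibrated = expected_calibration_error(confidences, calibrated_correct)
ece_miscalibrated = expected_calibration_error(confidences, miscalibrated_correct)
print(ece_calibrated, ece_miscalibrated)
```

In the first toy case the stated confidence matches the hit rate in each bin, so the ECE is essentially zero; in the second the same confidences come with 50% accuracy across the board, so the ECE works out to 0.25. A well-calibrated model looks like the first case, a poorly calibrated one like the second.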