


	by sgk284 1119 days ago
	OpenAI touches a little on this on page 12 of the GPT-4 technical report (https://cdn.openai.com/papers/gpt-4.pdf). Prior to aligning to safer outputs, the model's confidence in an answer is highly correlated with the actual accuracy of the answer. After alignment, though, the model is much less calibrated: its confidence in an answer is a far weaker guide to whether or not the answer is actually correct.
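
	For anyone who wants to check "confidence tracks accuracy" on their own evals, here is a rough sketch of one common way to measure it (expected calibration error). It assumes you already have a confidence in [0, 1] and a right/wrong label for each answer. The function name and the equal-width binning are my own choices for illustration, not something taken from the report.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Size-weighted average gap between accuracy and mean confidence per bin.

    confidences: floats in [0, 1], one per answer (assumed input format).
    correct: booleans, True where the answer was right.
    """
    # Group answers into equal-width confidence bins.
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the last bin
        bins[idx].append((conf, ok))

    total = len(confidences)
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        avg_conf = sum(c for c, _ in bucket) / len(bucket)
        accuracy = sum(1 for _, ok in bucket if ok) / len(bucket)
        # A well-calibrated model has accuracy close to avg_conf in every bin.
        ece += (len(bucket) / total) * abs(accuracy - avg_conf)
    return ece
```

	A value near 0 means the stated confidence matches how often the model is right; a larger value means the confidence is telling you less.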