by famouswaffles 659 days ago
That's not true. Several papers have probed this with different methodologies, and the conclusion is pretty clear.

LLMs know a whole lot more about the uncertainty of their predictions than they say.

GPT-4 logit calibration pre-RLHF - https://imgur.com/a/3gYel9r

Language Models (Mostly) Know What They Know - https://arxiv.org/abs/2207.05221

The Geometry of Truth: Emergent Linear Structure in Large Language Model Representations of True/False Datasets - https://arxiv.org/abs/2310.06824

The Internal State of an LLM Knows When It's Lying - https://arxiv.org/abs/2304.13734

LLMs Know More Than What They Say - https://arjunbansal.substack.com/p/llms-know-more-than-what-...

Just Ask for Calibration: Strategies for Eliciting Calibrated Confidence Scores from Language Models Fine-Tuned with Human Feedback - https://arxiv.org/abs/2305.14975

Teaching Models to Express Their Uncertainty in Words - https://arxiv.org/abs/2205.14334
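
For anyone unsure what "calibrated" means in the links above: a model is well calibrated if, among the answers it gives with roughly 80% confidence, about 80% turn out to be correct. A common way to put a number on this is expected calibration error (ECE). Below is a minimal sketch, not taken from any of the linked papers; the function name, the equal-width binning, and the toy inputs are my own assumptions for illustration.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average gap between mean confidence and accuracy per bin.

    confidences: model probabilities for its chosen answers, each in [0, 1]
    correct:     booleans, True where the chosen answer was right
    n_bins:      number of equal-width confidence bins (an assumed choice)
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    total = len(confidences)
    if total == 0:
        raise ValueError("need at least one prediction")

    # Group predictions into equal-width bins by confidence.
    bins = [[] for _ in range(n_bins)]
    for p, is_right in zip(confidences, correct):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, is_right))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(p for p, _ in members) / len(members)
        accuracy = sum(1 for _, r in members if r) / len(members)
        # Each bin contributes its confidence/accuracy gap, weighted by size.
        ece += (len(members) / total) * abs(avg_conf - accuracy)
    return ece


# Toy data (made up): says 75% and is right 3 times out of 4.
calibrated = expected_calibration_error([0.75] * 4, [True, True, True, False])

# Toy data (made up): says 95% but is right only 1 time out of 4.
overconfident = expected_calibration_error([0.95] * 4, [True, False, False, False])
```

The first toy case has no gap between stated confidence and accuracy, while the second has a large one. A lower ECE means the stated probabilities can be taken more at face value.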

1 comment

This is good stuff, thanks for sharing