by 2ndorderthought 47 days ago
LLMs "hallucinate" because they are stochastic processes predicting the next word without any guarantee of being correct or truthful. It's literally an unavoidable fact unless we change the modelling approach, which very few people are bothering to attempt right now.

Training data quality does matter but even with "perfect" data and a prompt in the training data it can still happen. LLMs don't actually know anything and they also don't know what they don't know.

https://arxiv.org/abs/2401.11817


> they also don't know what they don't know

they sort of do tho:

https://transformer-circuits.pub/2025/introspection/index.ht...

I won't quibble even though I likely should. Have to remember this is HN and companies need to shill their work otherwise ... Yes.

I will play along and assume this is sound. 10-40% +/- 10% is along the lines of "sort of" in a completely unreliable, unguaranteed and unproven way sure.

That’s not the only issue. They also have the problem that they’re built to always give an affirmative answer and to use authoritative wording, even when confidence is low. If they were trained to answer “I don’t know” instead of guessing, they’d hallucinate a lot less, but nobody seems to want that.

It calls to mind the issue of search engines that refuse to return “0 results found” anymore. Now they all try to give you related but ultimately incorrect results.

To me, that feels like gaslighting. It’s like if you ask someone to buy cheddar cheese at the store and they come back with mozzarella, and instead of admitting that the store was out of cheddar, they try to convince you that you actually really want mozzarella.

> If they were trained to answer “I don’t know”

If they were trained that "I don't know" was an acceptable answer, the model would be prone to always say "I don't know" because it's a universally acceptable answer.

It's a better answer even if it does "know".

That could be fixed with the right scoring scheme in training. The SAT exam (for college-bound high school students in the US) used a scheme like this for multiple choice questions. Correct answers are awarded 3 points (with choices a,b,c,d), incorrect answers are penalized with -1 point, and leaving the answer blank (equivalent to "I don't know") is worth 0 points. This way, the expected value of guessing a random answer when the student doesn't know is 0 points so you might as well leave it blank if your confidence in the answer is no better than a random guess.
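To make the arithmetic concrete, here is a minimal sketch of that scoring scheme in Python. The function names `expected_score` and `should_answer` are made up for illustration, and the +3 / -1 / 0 values are just the ones quoted above, not anything taken from a real training setup.

```python
from fractions import Fraction

def expected_score(p_correct, reward=3, penalty=-1):
    """Expected points for answering when the chance of being right is p_correct."""
    p = Fraction(p_correct)
    return p * reward + (1 - p) * penalty

def should_answer(p_correct, reward=3, penalty=-1, abstain=0):
    """Answer only if the expected score beats leaving it blank ("I don't know")."""
    return expected_score(p_correct, reward, penalty) > abstain

# Random guess among four choices (a, b, c, d): 1/4 * 3 + 3/4 * (-1) = 0
print(expected_score(Fraction(1, 4)))   # 0
# At a coin-flip level of confidence, answering beats abstaining
print(should_answer(Fraction(1, 2)))    # True
```

With these numbers the break-even confidence is exactly 1/4, so "I don't know" only wins when the answer is no better than a random guess. That is the property that would keep it from becoming the universally safe answer.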