by tshadley
556 days ago
> ...proving that this one particular piece of the hallucination problem may be conceptually simple. Everything mentioned in the article boils down to that one particular piece-- non-detected uncertainty. The architecture constraints referenced are all situations that cause uncertainty. Training data gaps of course increase uncertainty. Their solutions are a shotgun blast of heuristics that all focus on reducing uncertainty-- CoT, RAG, fine-tuning, fact-checking -- while somehow avoiding actually measuring uncertainty and using that to eliminate hallucinations!
Everything unwanted is error, by definition. All of the heuristics are about reducing error, because that's what the goal is. Some of that error is measurable. Some of it is not. You cannot "actually measure" error in any way other than asking people whether the output is what they want -- and that only works because that's how we're defining error. (It also turns out not to be a great definition, since people disagree on a lot of cases.)
You can come up with some metric that you label "uncertainty", and that metric may very well be measurable. But it's only going to be correlated with error, not equal to it.
One random example to illustrate the distinction: training gaps can easily decrease uncertainty. You have lots of mammals in your training data, and none of them lay eggs. You ask "The duck-billed platypus is my favorite mammal! Does it lay eggs?" Your model will be very confident when it responds "No". That is a high-confidence error.
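To make the platypus case concrete, here is a rough sketch in Python. Everything in it is made up for illustration: the toy `TRAINING` list, the `predict_lays_eggs` helper, and the choice of Laplace-smoothed frequencies as a stand-in for "confidence". It is not how an LLM works; it only shows that a measurable confidence score can go up while the answer stays wrong.

```python
# Hypothetical toy data: (animal, is_mammal, lays_eggs).
# The gap: no egg-laying mammal appears anywhere in the training set.
TRAINING = [
    ("dog", True, False),
    ("cat", True, False),
    ("whale", True, False),
    ("bat", True, False),
    ("robin", False, True),
    ("trout", False, True),
]


def predict_lays_eggs(is_mammal, data=TRAINING, smoothing=1.0):
    """Return (prediction, confidence) from Laplace-smoothed frequencies.

    This is an assumed stand-in for an "uncertainty" metric, not a real
    model: confidence is just the smoothed share of matching examples
    that agree with the prediction.
    """
    matches = [lays for _, mammal, lays in data if mammal == is_mammal]
    p_eggs = (sum(matches) + smoothing) / (len(matches) + 2 * smoothing)
    prediction = p_eggs >= 0.5
    confidence = max(p_eggs, 1 - p_eggs)
    return prediction, confidence


# "The duck-billed platypus is my favorite mammal! Does it lay eggs?"
prediction, confidence = predict_lays_eggs(is_mammal=True)
truth = True  # the platypus is a mammal that lays eggs
is_error = prediction != truth

# More mammals of the same kind widen the gap's effect: confidence rises,
# the answer is still wrong.
bigger = TRAINING + [(f"mammal_{i}", True, False) for i in range(96)]
_, confidence_big = predict_lays_eggs(is_mammal=True, data=bigger)

print(prediction, round(confidence, 2), is_error)
print(round(confidence_big, 2))
```

With four mammals the toy model answers "No" at roughly 0.83 confidence; with a hundred it answers "No" at about 0.99. The metric is perfectly measurable, and it points the wrong way on exactly the case the training data never covered.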