by alwa
848 days ago
I may be misunderstanding the way LLM practitioners use the word “hallucination,” but I understood it to describe something different from the kind of “random” nonsense-word failures that happen, for example, when the temperature is too high [0]. Rather, I thought hallucination, in your example, might be something closer to a grizzled old salesman-map-draftsman’s folk wisdom that sounds like a plausibly optimal mapping strategy to a boss oblivious to the mathematical irreducibility of the problem: imagining a “fact” that sounds plausible and is rhetorically useful, but that’s never been true and that nobody ever said was true. It’ll still be, like your human in the example, better than average (if “average” means averaged across the universe of all possible answers), and maybe even useful enough to convince the people reading the output, but it will nonetheless be false.

[0] e.g. https://news.ycombinator.com/item?id=39450669
|
The driver certainly cannot be relied on to always find an exact solution to an NP-complete problem. But failure modes matter. For practical purposes, the driver's solution is not simply "false". It's just suboptimal.
If we could get LLMs to fail in a similarly benign way, that would make them far more robust without disproving what the posted paper claims.