Hacker News new | ask | show | jobs
by fauigerzigerk 848 days ago
Perhaps I'm not using the vocabulary correctly here.

What I mean is, if you ask a human to solve a travelling salesman problem and they find it too hard to solve exactly, they will still be able to come up with a better than average solution. This is what I called approximation (but maybe this is incorrect?).

Hallucination would be to choose a random solution and claim that it's the optimum.

1 comments

I may be misunderstanding the way LLM practitioners use the word “hallucination,” but I understood it to describe it as something different from the kind of “random” nonsense-word failures that happen, for example, when the temperature is too high [0].

Rather, I thought hallucination, in your example, might be something closer to a grizzled old salesman-map-draftsman’s folk wisdom that sounds like a plausibly optimal mapping strategy to a boss oblivious to the mathematical irreducibility of the problem. Imagining a “fact” that sounds plausible and is rhetorically useful, but that’s never been true and nobody ever said was true.

It’ll still be, like your human in the example, better than average (if “average” means averaged across the universe of all possible answers), and maybe even useful enough to convince the people reading the output, but it will be nonetheless false.

[0] e.g. https://news.ycombinator.com/item?id=39450669

If a driver is tasked with visiting a number of places, they will probably choose a reasonably good route. If the driver claims to have found the optimal route, it may not be true, but it's still not a hallucination and it's still a pretty good route.

The driver certainly cannot be relied on to always find an exact solution to an NP-complete problem. But failure modes matter. For practical purposes, the driver's solution is not simply "false". It's just suboptimal.

If we could get LLMs to fail in a similarly benign way, that would make them far more robust without disproving what the posted paper claims.

> but it will be nonetheless false.

Only if you're assuming all questions have binary answers.

For example in the traveling salesman problem you don't have to compute all answers to start converging on an average. A random sampling of solutions can start setting a bounds for average, and your grizzled salesmans guesses would fall somewhere on that plot. If they are statistically better than average then they are far more than good enough. Unless of course you think burning up the observable universe in finding the best solution is the only way to solve the problem of which trip uses the least gas?