
by aguaviva 623 days ago

> LLMs _will_ hallucinate no matter what you do.

I think you're overthinking it. This is like saying LLMs will never be useful because they hallucinate. That's a known issue, and yet they have often proven quite useful nonetheless.

What it comes down to is, how often do they hallucinate, what's the negative impact when they do (both of which can be measured) and very importantly: for whatever their measured performance is, how does it compare to the next best alternative that users have?

It's not like they're trying to build a model to design a nuclear reactor in one go. It's just a Q+A bot, whose performance can be easily measured by benchmarking it against the top 30 questions or so in a given subject area (probably accounting for 95 percent of all inputs). And the current alternative users have (search engines) is pretty darn mediocre.
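A minimal sketch of that measurement, assuming a hand-graded benchmark. Every name and number here (`BenchmarkItem`, `traffic_share`, `harm_if_wrong`, the three sample questions) is hypothetical and only illustrates the comparison: how often each option is wrong, how costly its errors are, and how the bot stacks up against the alternative.

```python
from dataclasses import dataclass


@dataclass
class BenchmarkItem:
    question: str
    traffic_share: float    # assumed fraction of all user inputs this question covers
    harm_if_wrong: float    # assumed cost of a wrong answer, in arbitrary units
    bot_correct: bool       # human-graded: did the Q+A bot answer correctly?
    baseline_correct: bool  # human-graded: did the alternative (e.g. search) answer correctly?


def _is_wrong(item, use_bot):
    """True if the chosen system got this benchmark question wrong."""
    return not (item.bot_correct if use_bot else item.baseline_correct)


def error_rate(items, use_bot):
    """Traffic-weighted share of benchmarked inputs answered incorrectly."""
    covered = sum(item.traffic_share for item in items)
    wrong = sum(item.traffic_share for item in items if _is_wrong(item, use_bot))
    return wrong / covered


def expected_harm(items, use_bot):
    """Sum of (how often a question is asked) x (cost of getting it wrong)."""
    return sum(
        item.traffic_share * item.harm_if_wrong
        for item in items
        if _is_wrong(item, use_bot)
    )


# Made-up sample data: three questions covering 95 percent of inputs.
items = [
    BenchmarkItem("question A", 0.50, 2.0, bot_correct=True, baseline_correct=False),
    BenchmarkItem("question B", 0.30, 1.0, bot_correct=True, baseline_correct=True),
    BenchmarkItem("question C", 0.15, 10.0, bot_correct=False, baseline_correct=False),
]

for name, use_bot in (("bot", True), ("search", False)):
    print(name, error_rate(items, use_bot), expected_harm(items, use_bot))
```

With these invented numbers the bot comes out ahead on both measures; with real grading it could just as easily go the other way, which is the point of measuring.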

BTW I'm actually not much of a fan of LLMs or chatbots, so I have nothing to "sell" you here. But this is my rough take, based on my generally quite skeptical attitude toward this technology.

Which does seem to suggest that, at the very least, it's an idea worth exploring.

1 comments

az09mugen 622 days ago

I'm not overthinking it; some papers describe this, for example [0] and [1]. I agree that for some subjects you can tolerate some error.

The problem is not how often errors occur but how bad a single error can be. My point was about controversial topics, where a single error can do serious damage. Yes, it must be error-free, as for a nuclear reactor. Just imagine a Q&A chatbot answering questions on the subject of Israel and Palestine or something else really touchy: do you really think you can afford any error or hallucination?

[0] https://arxiv.org/abs/2409.05746

[1] https://arxiv.org/abs/2401.11817


aguaviva 622 days ago

You are indeed overthinking, because I just said, very clearly, that I acknowledged the hallucination problem, and yet you're throwing citations back as if I never heard of it. How many times does one have to say "it's a known issue"?

> My point was about controversial topics, where a single error can do serious damage.

Okay, but so can a single garbage article in a search engine's results. I guess one shouldn't build search engines then (unless they can be held to the same standards as nuclear reactors), because do you think we can afford even a single bad result? Just imagine what will happen, etc.
