> While the hallucination problem in LLMs is inevitable

Oh, please. That's the same old computability argument used to claim that program verification is impossible.

Computability isn't the problem. LLMs are forced to reply, regardless of the quality of the reply. If "Confidence level is too low for a reply" is an option, the argument in that paper becomes invalid.

The trouble is that we don't know how to get a confidence metric out of an LLM. This is the underlying problem behind hallucinations. As I've said before, if somebody doesn't crack that problem soon, the AI industry is overvalued.

Alibaba's QwQ [1] is supposedly better at reporting when it doesn't know something. Comments on that?

This article is really an ad for Kapa, which seems to offer managed AI as a service, or something like that. They hang various checkers and accessories on an LLM to try to catch bogus outputs. That's a patch, not a fix.

[1] https://techcrunch.com/2024/11/27/alibaba-releases-an-open-c...

You can make improvements, as your parent comment already said, but it's not a problem that can be solved, only reduced to some degree.
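
For anyone who wants to see what the "Confidence level is too low for a reply" option from upthread might look like, here is a minimal sketch. It assumes the serving stack exposes per-token log-probabilities for the generated text, and it uses the geometric mean of token probabilities as a stand-in score. The function name `answer_or_abstain` and the 0.5 threshold are hypothetical. Note that token likelihood is not a calibrated measure of factual correctness, which is exactly the unsolved part, so at best this reduces bad answers rather than eliminating them.

```python
import math
from typing import Optional, Sequence


def answer_or_abstain(
    text: str,
    token_logprobs: Sequence[float],
    threshold: float = 0.5,  # hypothetical cutoff, would need tuning per model
) -> Optional[str]:
    """Return the reply, or None when the proxy confidence is below threshold.

    token_logprobs is assumed to hold one natural-log probability per
    generated token; where those numbers come from is outside this sketch.
    """
    if not token_logprobs:
        # Nothing to score, so treat it as "don't know".
        return None
    mean_logprob = sum(token_logprobs) / len(token_logprobs)
    # exp(mean log p) is the geometric mean of the token probabilities.
    confidence = math.exp(mean_logprob)
    return text if confidence >= threshold else None


if __name__ == "__main__":
    reply = answer_or_abstain("Paris", [-0.05, -0.10])
    print(reply if reply is not None else "Confidence level is too low for a reply")
```

The caller has to handle `None` explicitly, which is the design point: abstaining becomes a first-class outcome instead of the model being forced to emit something.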