by illusionist123
1034 days ago
I think it's not possible to get rid of hallucinations given the structure of LLMs. Getting rid of hallucinations requires knowing how to differentiate fact from fiction. An analogy from programming languages that people might understand is type systems: well-typed programs are facts and ill-typed programs are fictions (relative to the given typing of the program). Eliminating hallucinations from LLMs would require something similar, i.e. a type system or grammar for what should be considered a fact. Another analogy is Prolog, which uses logical resolution to determine the consequences of a given database of facts. LLMs do not use logical resolution, and they don't have a database of facts against which to check whether what they generate is actually factual (or logically follows from some set of facts). LLMs are essentially Markov chains, and I am certain it is impossible to have Markov chains without hallucinations. So whoever is working on this problem, good luck, because you have a lot of work to do to get Markov chains to output only facts and not just correlations in the training data.
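To make the Prolog analogy concrete, here is a rough sketch of what "checking a claim against a database of facts" could look like. It is a deliberate simplification: it forward-chains over ground (variable-free) rules rather than doing Prolog's actual resolution with unification, and the names `derive`, `is_supported`, `FACTS` and `RULES` are made up for illustration.

```python
def derive(facts, rules):
    """Forward-chain over ground rules until no new fact can be added."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            # A rule fires only when every one of its premises is already known.
            if conclusion not in known and all(p in known for p in premises):
                known.add(conclusion)
                changed = True
    return known


def is_supported(claim, facts, rules):
    """True only if the claim is a stored fact or follows from the rules."""
    return claim in derive(facts, rules)


# Hypothetical toy database: two stored facts and one ground rule.
FACTS = {
    ("parent", "alice", "bob"),
    ("parent", "bob", "carol"),
}
RULES = [
    (
        (("parent", "alice", "bob"), ("parent", "bob", "carol")),
        ("grandparent", "alice", "carol"),
    ),
]

if __name__ == "__main__":
    print(is_supported(("grandparent", "alice", "carol"), FACTS, RULES))
    print(is_supported(("grandparent", "bob", "alice"), FACTS, RULES))
```

The point is only that a claim with no derivation gets rejected here no matter how plausible it sounds, and that is the check a pure next-token predictor does not have.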
ChatGPT is often 'confidently wrong'. I'm pretty sure I've been confidently wrong a few times too, and I've met a lot of other people in my life who've expressed that trait from time to time, intentionally or otherwise.
I think there is an inherent trade-off between 'confidence', 'expression', and of course 'a priori bias in the input'. You can learn to be circumspect when you are unsure, and you can learn to better measure your level of expertise on a subject.
But you can't escape that uncertainty entirely. On the other hand, I'm not very convinced about efforts to train LLMs on things like mathematical reasoning. These are situations where you really do have the tools to always produce an exact answer. The goal for these types of problems should not be to learn holistically how to both identify and solve them, but exclusively how to identify and define them, and then pass them off to exact tools suitable for computing the solution.
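As a rough sketch of that hand-off, assume the model's only job is to emit the problem as an arithmetic expression string, and an exact evaluator does the rest. The function name `evaluate_exactly` is hypothetical; the evaluator here is just Python's standard `ast` and `fractions` modules, restricted to integers and the four basic operations.

```python
import ast
import operator
from fractions import Fraction

# Only these binary operators are accepted; anything else is rejected.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def _eval(node):
    # Integer literals become exact rationals (bools are excluded on purpose).
    if (
        isinstance(node, ast.Constant)
        and isinstance(node.value, int)
        and not isinstance(node.value, bool)
    ):
        return Fraction(node.value)
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -_eval(node.operand)
    raise ValueError("unsupported expression")


def evaluate_exactly(expression: str) -> Fraction:
    """Evaluate an arithmetic expression with exact rational arithmetic."""
    tree = ast.parse(expression, mode="eval")
    return _eval(tree.body)


if __name__ == "__main__":
    # The string stands in for what a model would hand off.
    print(evaluate_exactly("1/3 + 1/6"))
```

Whatever the model gets wrong in stating the problem stays wrong, but the arithmetic itself is no longer something it has to guess at.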