HN Mirror


by energy123 115 days ago

This guy sounds like an uninformed jackass.

Look at Gemini 3.1 Pro on the AA-Omniscience Index, which measures hallucinations. It scores 30; the previous best was 11.

https://artificialanalysis.ai/evaluations/omniscience

With the amount of talent working on this problem, you would be unwise to bet against it being solved, for any reasonable definition of solved.


disgruntledphd2 115 days ago

> With the amount of talent working on this problem, you would be unwise to bet against it being solved, for any reasonable definition of solved.

I'm honestly not sure how this issue could be solved. Fundamentally, LLMs are next-token (or N-tokens-ahead) predictors. They don't have any way, in and of themselves, to ground their token generations, and since token n depends on all of tokens 1 through n-1, small discrepancies can easily spiral out of control.
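To put rough numbers on the "spiral out of control" worry, here is a minimal sketch. It assumes every token goes wrong independently with the same fixed probability, which is a deliberate simplification (real errors are correlated and models can sometimes recover); the function name and the 1% error rate are made up for illustration.

```python
def error_free_probability(per_token_error: float, num_tokens: int) -> float:
    """Chance that none of `num_tokens` generated tokens is wrong.

    ASSUMPTION (illustrative only): each token errs independently with
    the same probability `per_token_error`. Real LLM errors are
    correlated, so treat this as a back-of-the-envelope bound at best.
    """
    if not 0.0 <= per_token_error <= 1.0:
        raise ValueError("per_token_error must be between 0 and 1")
    if num_tokens < 0:
        raise ValueError("num_tokens must be non-negative")
    # Every token must be correct, so the per-token success rates multiply.
    return (1.0 - per_token_error) ** num_tokens


if __name__ == "__main__":
    # Hypothetical 1% per-token error rate at increasing output lengths.
    for n in (10, 100, 1000):
        print(n, error_free_probability(0.01, n))
```

Under that assumption, a 1% per-token error rate leaves only about a 37% chance of a fully clean 100-token answer, and the chance keeps shrinking geometrically as the output gets longer.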


simianwords 115 days ago

Solving it doesn't mean eliminating it completely. I think GPT has solved it to a sufficient extent that it's reliable. You can't easily get it to hallucinate.


disgruntledphd2 115 days ago

It depends on how much context is in the training data. I find that they make stuff up more in places where there isn't enough context (so more often in internal $work stuff).
