
That's mostly intuitive.

An accurate answer is often driven by a concrete, high-confidence fact in the training dataset (e.g. a structured-data fact such as a birth date from Wikipedia).

Hallucinations are derived facts of (hopefully) low confidence. Nondeterminism is more common when scores are low. In a usable system only a few facts can get a high score, while many can get a low score, and at that point numeric instability can make a mess.
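To make the numeric-instability point concrete, here is a toy sketch in Python. It is not how any particular LLM computes scores; the names `summed` and `winner` and the example numbers are made up for illustration. It only shows that floating-point addition depends on the order of operations, so a near-tie between two candidates can resolve differently from run to run, while a clear winner does not care.

```python
def summed(parts, order):
    """Add the same contributions in a given order (float addition is not associative)."""
    total = 0.0
    for i in order:
        total += parts[i]
    return total


def winner(score_a, score_b):
    """Pick candidate A only if it strictly beats candidate B."""
    return "A" if score_a > score_b else "B"


# Hypothetical contributions to candidate A's score.
parts = [0.1, 0.2, 0.3]

forward = summed(parts, [0, 1, 2])   # 0.1 + 0.2 + 0.3
backward = summed(parts, [2, 1, 0])  # 0.3 + 0.2 + 0.1

# Same numbers, different accumulation order, slightly different result.
print(forward, backward, forward == backward)

# Near-tie: candidate B scores 0.6, so the rounding difference decides the winner.
print(winner(forward, 0.6), winner(backward, 0.6))

# Clear gap: candidate B scores 0.2, so the order of addition is irrelevant.
print(winner(forward, 0.2), winner(backward, 0.2))
```

With many low-scoring candidates bunched together, near-ties like the second case are far more likely than with a single dominant fact, which is one plausible reading of why low scores and nondeterminism go together.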

I'm not very familiar with LLMs, but I do have experience with traditional ML models and production content-understanding systems, and LLMs are not far from those.