
>The training data is the underlying truth

Correct. What is the training data? Language, in the form of sentences, documents, words, and "tokens". No human language has any normal or natural encoding of "fact" or "truthiness", which is the entire point. You can only rarely evaluate a string of text for truthiness without external context.

An LLM "knows" the structure and look of valid text. That's why it rarely produces grammar mistakes, even when "hallucinating". A lie, a made-up reference, a physical impossibility, a contradiction: all of these are "valid sentences". That's why you can never prevent an LLM from producing falsehoods, lies, contradictions, etc.
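
To make that concrete, here's a toy sketch. It is a bigram counter, not an LLM, and the four-sentence corpus and the function names are made up for illustration. It only shows the narrow point that a model scoring text by how well it matches patterns in its training data can rate a false sentence exactly as high as a true one, while still rejecting word salad.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus: every sentence here happens to be true.
CORPUS = [
    "the sky is blue",
    "the grass is green",
    "the sea is blue",
    "the leaf is green",
]

def train_bigrams(sentences):
    """Count how often each token follows the previous one."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts

def sentence_probability(counts, sentence):
    """Product of unsmoothed bigram probabilities for the whole sentence."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    prob = 1.0
    for prev, nxt in zip(tokens, tokens[1:]):
        total = sum(counts[prev].values())
        if total == 0:
            return 0.0  # previous token never seen in training
        prob *= counts[prev][nxt] / total
    return prob

model = train_bigrams(CORPUS)
p_true = sentence_probability(model, "the sky is blue")        # true statement
p_false = sentence_probability(model, "the sky is green")      # false, but well-formed
p_scrambled = sentence_probability(model, "sky the blue is")   # not well-formed

print(p_true, p_false, p_scrambled)  # 0.125 0.125 0.0
```

Nothing in those counts separates "blue" from "green" after "the sky is". The only thing the model can see is that both continuations look like text it was trained on.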

Truthiness cannot be hacked in after the fact. I currently believe that LLMs as an architecture are not a powerful enough statistical tool for this: even with the "truthiness" of the entire corpus somehow labeled, I doubt you even COULD train an LLM to learn it. And labeling the corpus that way is on its own a fairly impossible task.