Hacker News
by m-i-l 1205 days ago
I clicked on the article hoping to find information on potential new cognitive skills humans will need to learn to differentiate Large Language Model (LLM) hallucinated facts from real facts, but unfortunately the article doesn't touch upon this.

Reading the comments it seems likely that the article is itself LLM-generated blogspam, in which case it won't be aware of the potential for hallucinated facts.

I was thinking the other day that we really need a new term for this. In 2016 we had "post-truth", but that implies humans deliberately making stuff up to deceive people, for whatever reason, but LLMs making stuff up don't really knowingly do so, and don't really have a motive. There is the term "consensus reality", but the danger is that with more and more LLM-generated content appearing on the internet, which may ultimately pollute future training, we may find "consensus" isn't sufficient to determine reality any more. Perhaps the new term for what we're heading towards could be something like the "post-reality" era, or something like that.

Not sure what the solution to this is either, other than withdrawing from the mainstream internet and sticking to the small known pockets of human resistance (while they still exist).

6 comments

Perhaps instead of using the word "hallucination" we could use "average-context". When the LLM doesn't have enough information, it computes some average of the information available in the surrounding context, so a hallucination is a wrong result produced by averaging within a context. But the context itself could also be wrong.
Have there been papers published about what happens to the user experience when average-context is tightly constrained to small weighting ranges or eliminated altogether, and the model just throws an "insufficient data" error?
It handles human grammar by averaging and assuming contexts, so you can't really fix one side without hurting the other. Humans separate grammar from facts, but these language models don't. To them grammar and facts are the same thing, so you can't just tell it to stop lying without it also losing all the grammar tricks we expect from it.
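To make the "insufficient data" idea above concrete, here is a minimal sketch of what abstaining could look like, assuming you already have a score per candidate answer. Everything here (`InsufficientDataError`, `answer_or_abstain`, the `min_confidence` threshold of 0.6) is made up for illustration and isn't how any particular model works. Per the reply above, a real LLM's token probabilities mix fluency and facts, so a threshold like this wouldn't cleanly separate the two.

```python
import math


class InsufficientDataError(Exception):
    """Raised when no candidate is confident enough (hypothetical name)."""


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def answer_or_abstain(candidates, scores, min_confidence=0.6):
    """Return the top candidate, or raise if its probability is too low.

    `min_confidence` is an assumed, hand-picked threshold.
    """
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < min_confidence:
        raise InsufficientDataError(
            f"top candidate only has probability {probs[best]:.2f}"
        )
    return candidates[best]
```

With one clearly dominant score the function returns that candidate; with flat scores (say three candidates all scored 1.0) the top probability is about 0.33 and it raises instead of guessing.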
> I was thinking the other day that we really need a new term for this. In 2016 we had "post-truth", but that implies humans deliberately making stuff up to deceive people, for whatever reason, but LLMs making stuff up don't really knowingly do so, and don't really have a motive. There is the term "consensus reality", but the danger is that with more and more LLM-generated content appearing on the internet, which may ultimately pollute future training, we may find "consensus" isn't sufficient to determine reality any more. Perhaps the new term for what we're heading towards could be something like the "post-reality" era, or something like that.

Really this is just postmodernism: the general collapse in epistemic certainty, leading to viewing reality purely in terms of text. "Il n’y a pas de hors-texte" (Derrida); there is nothing outside the text. GPT would have to agree with Derrida, because it knows nothing but text. It has "all" the text, or at least all the text that could be found and fed to it, but nothing outside that.

(and likewise it really accelerates "Sokal hoax" questions!)

You're absolutely correct to emphasize the text medium. Neurolinguistic programming isn't just an arcane meme for academics. The most important part of it is that it only works if the user cannot separate reality from words.

Marshall McLuhan is really the techno-prophet of our era.

I think "noise-era" is most apt. Over time people/consumers will lean into a signal-vs-noise approach to using the internet instead of a whatever-is-put-in-front-of-me-by-an-algo-but-call-it-'discovery' approach.
Since /r/confidentlyincorrect exists, I propose to call them ci-claims :D
“confabulations” seems to fit.
The LLM-generated misinformation isn’t any different than the misinformation we had before ChatGPT. (Perhaps it’s worded a bit better in some cases.)

We have the same sources of truth we’ve always had: trusted sources for textbooks, articles, etc.