by est 1101 days ago

> so all it can say about them is pure hallucination

This "hallucination" has come along a lot recently. Is it a legit concept or just "the dog ate my homework" type of excuse for anything?

I mean, does the human mind also "hallucinate" all the time? Why do we expect an "artificial" mind to outperform us?

3 comments

masklinn 1101 days ago

> This "hallucination" has come along a lot recently.

Couldn’t exactly be otherwise given how young GPT is. ChatGPT was released a bit under 7 months ago.

> Is it a legit concept or just "the dog ate my homework" type of excuse for anything?

It’s an analogy for how LLMs work. An LLM does not know anything, it just adds tokens probabilistically based on the previous tokens.

So essentially it always hallucinates (makes shit up as it goes along, if you prefer).

Thanks to the model it’s generally quite credible, and often even lines up with actual reality, but it should not be confused for knowledge.

That’s why it will confidently give you citations it just made up, to papers or decisions it’ll happily make up as well (though less and less credibly as things get closer to hard facts).
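
To make "adds tokens probabilistically based on the previous tokens" concrete, here is a minimal sketch using a toy bigram model. The corpus and all function names are made up for illustration, and conditioning on only the single previous token is an assumption of this sketch; a real LLM uses a neural network conditioned on the whole preceding context.

```python
import random
from collections import Counter, defaultdict

# Toy corpus, invented purely for illustration.
CORPUS = "the cat sat on the mat . the dog sat on the rug .".split()


def build_bigram_counts(tokens):
    """Count how often each token follows each preceding token."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def sample_next(counts, prev, rng):
    """Sample the next token in proportion to how often it followed `prev`."""
    followers = counts[prev]
    candidates = list(followers)
    weights = [followers[tok] for tok in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]


def generate(counts, start, length, rng):
    """Append up to `length` sampled tokens, one at a time, after `start`."""
    out = [start]
    for _ in range(length):
        if not counts[out[-1]]:  # no continuation was ever observed
            break
        out.append(sample_next(counts, out[-1], rng))
    return out


COUNTS = build_bigram_counts(CORPUS)

if __name__ == "__main__":
    print(" ".join(generate(COUNTS, "the", 8, random.Random(0))))
```

Depending on the random draws, this can emit "the cat sat on the rug", a fluent sentence that never appears in the corpus. Nothing in the sampling loop checks whether the output is true; it only checks what tends to follow what.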

gpderetta 1101 days ago

> It’s an analogy for how LLMs work. An LLM does not know anything, it just adds tokens probabilistically based on the previous tokens

This seems like a deep statement that keeps getting repeated, but it doesn't mean anything. The probabilistic model that is used to decide the next token could be arbitrarily complex, including encoding knowledge (or just asking a panel of experts).

It seems pretty self-evident that the model does in fact encode knowledge, just in a very lossy way, and its recall is also flawed.

seba_dos1 1101 days ago

It sure does encode some knowledge, because it's a language model and languages already do so on their own. It's far from what you'd usually call a "knowledge model" though.

Sharlin 1101 days ago

Which is why "hallucination" is really the wrong word to use, "confabulation" would be more proper. But "hallucination" has stuck because it's the word used back when people first figured out the trick of running image classifiers "in reverse" to generate images from noise.

masklinn 1101 days ago

Sure but nobody knows the word “confabulation”, and lying / making things up implies intent.

So “hallucination” hews close enough to have good explanatory powers.

Sharlin 1101 days ago

Confabulation is unintentional, FWIW:

> In psychology, confabulation is a memory error defined as the production of fabricated, distorted, or misinterpreted memories about oneself or the world. […] Confabulation occurs when individuals mistakenly recall false information, without intending to deceive.

masklinn 1101 days ago

Yes, which is why I agree that it’s a better term. That’s not the issue.

Sharlin 1101 days ago

Ah, I misinterpreted your previous comment!

est 1101 days ago

> it just adds tokens probabilistically based on the previous tokens

I mean, isn't this what humans do all the time? Bullshitting random topics on the Internet, except humans tend to add disclaimers like "I am not a lawyer but" and stuff.

masklinn 1101 days ago

> I mean, isn't this what humans do all the time?

No? Most humans don’t randomly vomit text based on what sounds good.

> Bullshitting random topics on the Internet, except humans tend to add disclaimers like "I am not a lawyer but" and stuff.

Which shows a much higher level of understanding, both of the field (which may be flawed), and of their own understanding of the field (which they point out).

An LLM does not do that. It doesn’t just repeat potentially wrong hearsay or incorrect memories (let alone have actual understanding and knowledge of the field); it confidently writes out delusions.

est 1101 days ago

> Most humans don’t randomly vomit text based on what sounds good.

Unless humans were given a task? e.g. taking exams while un-prepared.

My kid usually gives me a long description of imaginary stuff based only on the name or a brief intro. It's great fun when the real deal is finally revealed.

seba_dos1 1100 days ago

That's absolutely right. That said, people don't usually take exam output of unprepared students and expect it to be useful :)

number6 1101 days ago

"As a Language Model" is the new "I am not a lawyer"

clnq 1101 days ago

When it comes to producing language, I think the human mind hallucinates not unlike GPT. Humans say a lot of stuff because they vaguely feel it is true. So does an LLM when it talks about things it’s underfitted for.

Anyway, hallucination is a term in generative AI. It means that the model produces results inconsistent with its training data. Or that’s what people say; sometimes the training data is just not that good.

est 1101 days ago

> It means that the model produces results inconsistent with its training data. Or that’s what people say; sometimes the training data is just not that good.

If you ask a real person to put together an essay on an obscure topic without extensive research, I bet 80% of the content would be made-up "hallucinations".

seba_dos1 1101 days ago

Well, that's hardly surprising. Asking random people to put together essays on obscure topics without extensive research is a great way to produce essays as useful as ChatGPT's output ;)

klempner 1101 days ago

It "came along a lot recently" because everything ChatGPT is "recently" -- it only came out 6 months ago.
