by sensanaty 643 days ago
I hate that the AI pundits have succeeded in popularizing the notion of "hallucination", anthropomorphizing these balls of statistics into something that seems like it's actually in some sort of deep thought process akin to a person's mind.

No, it's not "hallucinating". It's not lying, or making things up, or anything like that either. It's spitting out data according to what triggers the underlying weights. If this were a regular JSON API endpoint, you wouldn't say the API is hallucinating, you'd say "This API is shit" because it's broken.

9 comments

Do we really need to have this discussion in every thread about LLMs?
As long as AI-bros are pushing for making AI models seem like more than they are to pad their wallets, there'll be someone like me pointing out that, no, it's not "hallucinating", it's spitting bad data.
You're being pedantic. Your statement that "it's spitting bad data" is incorrect too, as it implies agency. Actually, nothing is happening but electrons flowing. The notion of an "it" that "spits" "data" which is "bad" is your own conceptual overlay.
Tbf, if you assume humans have agency, there are plenty of people who would claim you're making the same mistake, because the reductionist view is that people are just deterministic chemical soup (maybe with a bit of randomness baked in).
I know lots of people working on AI. They are among the least bro-y group of people I have ever met.

There is simply nothing similar to actual bro-y finance culture among AI research engineers. It is entirely a figment of the media and backreaction that we currently have to portray everyone we don’t like as a “bro” - truth be damned.

No - the cliques are different but joined at the hip. Add international finance, too: India, China and others.
whatever your information diet is, i recommend you change it
> I hate that the AI pundits have succeeded in popularizing the notion of "hallucination", anthropomorphizing these balls of statistics into something that seems like it's actually in some sort of deep thought process akin to a person's mind.

I'd argue the opposite: people think a person's mind is in "deep thought" when it's actually just a ball of statistics.

Do you think that an LLM would spit out Latin and English if you trained it with homo sapiens mumbling?

Yet, humans managed to do that (albeit over many generations)

Ergo, humans are not just balls of statistics

Not intended to be snarky, but what would you consider them? Is it akin to a function in the mathematical sense, that takes (sensory) input and creates output based on that? If so, how does this function work, if not by statistics? I am genuinely interested in your point of view. Also: don't you think humans can be somewhat compared to a "pretrained model", as in human genetics gives the brain a head start, so that it can start speaking Latin from what you deem "homo sapiens mumbling"?
Not a specialist, but I think each individual is just a small step of gradient descent for the large neural network of humankind.

At our individual scale, we look like a rigid ball of statistics, but at the global scale, we carry a small amount of gradient/delta that pushes humankind in a broader direction.

LLMs have been able to reproduce the former, it is unclear how they can contribute to/replicate the latter.

The right word is "confabulation", which is when we fill in missing information but may not be aware that we are doing it.

We all confabulate to some degree, as any neural system must, since no training data is stored perfectly.

Human "hallucinations", in contrast, are a particular kind of breakdown in our sensory feedback loops, which is not a process LLMs even have.

Hallucinations occur when our internal sensory feedback loops overpower actual sensory input, resulting in a stream of false sensory experience/signals being generated and processed. The false running experience might still incorporate some actual sensory information or not.

When we dream, we are hallucinating - our sensory experience loop running free of our actual senses - to a productive purpose.

The reason our senses have feedback is so that we can use our interpretation of sensory input as cues to make interpreting the next moment's input easier. But it's important that our running interpretation can reset when new input significantly diverges from our expectations, so it can quickly reorient.

(Not only is it important to revert to a raw input interpretation to ensure our running interpretation keeps up with actual context changes and corrects misinterpretations, but such resets signal that something novel or unexpected has happened, so they likely trigger learning.)

So "hallucinations" was an unfortunate and misleading choice of terminology.

I've got bad news for you – that term was used in deep learning research well before LLMs came on the scene. It has nothing to do with pundits trying to popularize anything or trying to justify LLMs' shortcomings, it was just a label researchers gave to a phenomenon they were trying to study.

A couple papers that use it in this way prior to LLMs:

- 2021: The Curious Case of Hallucinations in Neural Machine Translation (https://arxiv.org/abs/2104.06683)

- 2019: Identifying Fluently Inadequate Output in Neural and Statistical Machine Translation (https://aclanthology.org/W19-6623/)

can we make a siloed version of HN for your political faction? it’s tiresome reading these in every thread
Maybe an evolutionary / structuralist lens is helpful here: terms that rapidly diffuse through discourse are those that people like most, and most people like to anthropomorphize, so "hallucination" has come to take on a new meaning, and we all (to different degrees) know what it is referring to.
Give it a rest. Everything is statistics.

Sees space shuttle "pff, it's just a pile of engineering."

Yeah it's simply model error. All models from Linear Regression to LLMs have error. I guess because this type of error is in the form of deceptively reasonable human language, it gets a different moniker. It's also notably harder to quantify so it might warrant a different name.
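
To make the "all models have error" point concrete, here's a rough sketch (toy data and function names are my own, purely for illustration): fit a straight line to data that is actually quadratic. The fit has nonzero error on the training points, and it will still hand you a perfectly confident-looking number well outside the data.

```python
# Hypothetical illustration: ordinary least squares on data a line cannot fit exactly.

def fit_line(xs, ys):
    """Closed-form least squares for y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b

# Toy data (assumed for the example): the true relationship is y = x^2.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [x ** 2 for x in xs]

a, b = fit_line(xs, ys)

def predict(x):
    """Prediction of the fitted line at x."""
    return a * x + b

# Mean squared error on the training points: nonzero, because the model is misspecified.
mse = sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Far outside the training range the line still returns a plausible-looking number.
extrapolated = predict(10.0)
true_value = 10.0 ** 2

print(f"fit: y = {a:.1f}x + {b:.1f}, training MSE = {mse:.2f}")
print(f"prediction at x=10: {extrapolated:.1f}, true value: {true_value:.1f}")
```

Nobody would call the line's answer at x=10 a hallucination; it's just error that happens to come out as a number instead of a fluent sentence.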
do you really want to have a discussion about 'thought' and 'mind'? i don't