Hacker News
by linkjuice4all 847 days ago
Any recommendations? The public seems to actually understand what this means although it’s just more anthropomorphization of a random bullshit generator.
5 comments

How about you call them what they are:

Bugs, Defects and "not fit for production".

How about we stop with all the nonsense around calling it "temperature" like it's a sick baby and call it RAND, 'cause that's what it is.
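
For reference, here is a minimal sketch of what "temperature" does in a typical sampler, assuming the common softmax-with-temperature setup. The logits are made up for illustration and `softmax_with_temperature` is a hypothetical helper, not any particular library's API. Temperature only rescales the scores; the randomness comes from the draw that follows.

```python
import math
import random

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; temperature rescales the scores first."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate tokens (assumption, not real model output).
logits = [2.0, 1.0, 0.0]

cold = softmax_with_temperature(logits, 0.1)   # nearly all mass on the top token
warm = softmax_with_temperature(logits, 1.0)   # the plain softmax
hot = softmax_with_temperature(logits, 10.0)   # close to uniform

# The actual random step: draw one token index according to the probabilities.
random.seed(0)
token = random.choices(range(len(logits)), weights=warm, k=1)[0]

for name, probs in [("cold", cold), ("warm", warm), ("hot", hot)]:
    print(name, [round(p, 3) for p in probs])
```

At low temperature the draw is close to always picking the top-scoring token; at high temperature it approaches a uniform pick.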

The PT Barnum levels of bullshit around ML (see, we have a term that doesn't use "artificial" or "intelligence") have gotten old. Sam Altman is the next Elizabeth Holmes.

</rant>

I came here to suggest the same thing. This "hallucination" soft euphemism seems to be the tech press's way to continue to write positively about defective AI software while lightheartedly joking about how it sometimes does an oopsie.

If I ask a piece of software to write about a well-known fact or historical event and it just makes stuff up, it's not simply hallucinating. It's defective.

The thing is, it isn't a defect. People misunderstand that there is no practical difference between a "hallucinated" result and a real one, as far as an LLM is concerned. It doesn't reason or calculate beyond matching tokens; it has no deeper contextual understanding of truth or correctness beyond statistical likelihood. Hallucinations are the result of the LLM doing exactly what it's designed to do, exactly the way it's designed to do it.

The defect isn't in the software, but in people expecting these things to operate the way AIs in sci-fi do, or who believe that because they can produce coherent results in natural language, they must be sentient and self-aware.
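
To make the "no practical difference" point concrete, here is a toy sketch. It assumes a bigram word model, which is far cruder than a real LLM, and a two-sentence corpus invented for illustration. The same code path that reproduces a true sentence also produces a false one, with the same probability.

```python
from collections import defaultdict

# Toy training corpus: two true sentences (illustrative assumption).
corpus = [
    "paris is the capital of france",
    "rome is the capital of italy",
]

# Count bigram transitions: word -> {next_word: count}.
counts = defaultdict(lambda: defaultdict(int))
for sentence in corpus:
    words = sentence.split() + ["<end>"]
    for current, nxt in zip(words, words[1:]):
        counts[current][nxt] += 1

def completions(word, prob=1.0):
    """Yield (words, probability) for every sentence the model can generate from `word`."""
    if word == "<end>":
        yield [], prob
        return
    total = sum(counts[word].values())
    for nxt, c in counts[word].items():
        for rest, p in completions(nxt, prob * c / total):
            yield [word] + rest, p

# Enumerate everything the model can say when it starts from "paris".
results = {" ".join(words): p for words, p in completions("paris")}
for sentence, p in results.items():
    print(f"{p:.2f}  {sentence}")
```

Nothing in the model marks "paris is the capital of italy" as different from the true sentence; both are just likely continuations of the tokens seen so far.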

It's a defect from the point of view of user expectations. When Intel's floating point bug was in the news, I remember a small number of people claiming it was not a defect because the chip was just doing what it was designed to do. Yeah, it was designed in such a way that it could produce incorrect results. In other words, a bug!

I'm sure AI companies will get very good at explaining away these defects with various forms of "aCkShUaLlY" but when your marketing materials say you made a box that takes a prompt and answers it, and it answers incorrectly, what else is it than a defect?

The problem, in that case, exists between the keyboard and chair.

Floating point math is inherently inaccurate, and no programmer using it would expect perfect precision and call it a defect not to get it. You have to understand how floating point works and take that inaccuracy into account. As a result there are some applications for which using floats is simply a bad idea. No one sane is doing real money calculations with floats.
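
A quick sketch of the float-versus-money point, using Python's standard `decimal` module. The amounts are made up for illustration.

```python
from decimal import Decimal

# Binary floats cannot represent 0.1 or 0.2 exactly, so the sum drifts.
float_total = 0.1 + 0.2
print(float_total == 0.3)        # False

# Adding ten 10-cent charges as floats does not land exactly on 1.0.
float_sum = sum([0.1] * 10)
print(float_sum == 1.0)          # False

# Decimal keeps exact base-10 values, which is what money needs.
decimal_total = Decimal("0.10") + Decimal("0.20")
print(decimal_total == Decimal("0.30"))   # True

# Another common approach: store integer cents and format at the edges.
cents_sum = sum([10] * 10)
print(cents_sum == 100)          # True
```

The float results are not a defect in the hardware; they follow from the representation, which is why the tool has to match the job.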

The same goes for LLMs. Hallucination is fundamental to the model. We're going to have to realize that there are many tasks for which AI simply isn't well suited. And we're going to have to get over this persistent delusion that humans are categorically worse than AI at everything. A paralegal doing research would probably not simply fabricate cases and cites out of whole cloth. That's not how most humans work. Humans are capable of knowing when they don't know something; AI is not.

But we've decided, for whatever reason, that AI is perfectly trustworthy. That's going to keep biting us in the ass until we learn.

I don't think your memory of the Intel bug[1] is correct. It had nothing to do with the inherent precision problems of floating point representation.

EDIT: Another way to put it: If I sold a calculator that claimed to do math, but in the fine print I said "Actually, it just makes up answers by some means we don't fully understand, and somehow most of the time it comes up with the right answer," that wouldn't mean that incorrect answers are suddenly not defects.

1: https://en.wikipedia.org/wiki/Pentium_FDIV_bug
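
For anyone who wants to see the distinction, here is a sketch of the widely circulated FDIV check. The operands are the commonly cited pair; the value of 256 for flawed chips is as reported at the time, not something this snippet can reproduce on working hardware.

```python
# Commonly cited operands for the Pentium FDIV check.
x, y = 4195835.0, 3145727.0

# On correct hardware, dividing and multiplying back recovers x up to
# ordinary rounding error. Flawed Pentiums reportedly returned 256 here.
residual = x - (x / y) * y

print(residual)
print(abs(residual) < 1e-6)
```

Ordinary floating point rounding would leave a residual many orders of magnitude smaller than 256, which is why the FDIV behavior counted as a bug rather than an expected limitation.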

Fit for purpose... most of the time, except when it isn't, then oops... Let's color in the failure with a human term, "hallucination", 'cause "we can't really fix it".

Sugarcoating the fact that it is defective (defined: imperfect or faulty) isn't changing things.

Your explanation is correct, it's defective by design.

LLMs are hallucinating machines. They never not hallucinate. Coincidentally, sometimes they hallucinate something true.
This is exactly why we shouldn't call it a hallucination when the AI outputs false statements.

Saying it hallucinated is just a tautology.

I forget where I originally heard this idea, but I always explain to people that LLMs are (affectionately) "bullshitters." Terms like "lying" or "hallucinating" imply that it's trying to tell the truth, but actually it doesn't care at all whether what it says is true, save for the fact that true text is slightly more plausible than false text.
Instead of ‘hallucinations’, try ‘samplings from the model that happen not to be sufficiently reminiscent of reality’. Of course, it’s a little bit less catchy. But that’s the problem with catchiness — it sticks regardless of its truth.

The fact that ‘correct’ outputs are treated as if they’re the product of an in-any-way-different process to the ‘hallucinated’ ones is the problem.

> The fact that ‘correct’ outputs are treated as if they’re the product of an in-any-way different process to the ‘hallucinated’ ones is the problem.

Also, this particular context just makes it easier to notice, compared to a 5000-word generated coherent word salad that is equally wrong, but across the 5000 words.

call it what it is: random bullshit.
I guess that's fewer syllables than "hallucination."

I'm not sure how I feel about the term "hallucination" as it's applied to AI. Since you seem strongly opposed to it, let me ask you this long-winded half-question:

People understand computer things by creating analogies to the physical world - just look at the "Desktop" motif. "Folders" and "Files" too, for that matter. It seems to me that anthropomorphization would fit under that umbrella, though you may disagree. How do you feel about computer anthropomorphization in general? Is there something about "hallucination" that's particularly offensive?

Well, both the true and false outputs are equally random bullshit from a machine. "Hallucination" is just the word that caught on to describe random bullshit outputs that are false.
Even "bullshit" implies something like a mind, with intent to deceive. It should be more like noise, aberrations, incorrectly extrapolated filler material.
Bullshit is not deception. There is no intent to convince. Bullshit is even less than that. Bullshit is just blowing hot air to create a buffer between reality and its consequences.
I always thought of hallucinations/dreams as "random bullshit in my brain"