by imchillyb 1055 days ago
A hallucination is an unexpected emergence.

The 'making up' of facts, because it cannot determine a fact from fiction, is entirely expected behavior.

There is no 'hallucination' as the behavior is anticipated, expected, and entirely within normal operations processes.

The bullshit comes from there being no model of trust these AIs subscribe to. I'd love-love-love to see these AI producers be held to some responsibility to verification of truth and ethics.

These companies/universities/groups allowing their applications to bold-face-lie (misrepresent data with authority) to citizens should be top-priority to bash-in-the-face by legislators around the world.

3 comments

> There is no 'hallucination' as the behavior is anticipated, expected, and entirely within normal operations processes.

Exactly. These are models that predict text sequences. These sequences often semantically express falsehoods, but the model's not "lying", it's not "hallucinating", and it's definitely not malfunctioning. It's doing exactly what it was designed to do.
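
To make "predicts text sequences" concrete, here is a toy sketch, assuming a bigram table stands in for the model. It is nothing like a real LLM, and the two-sentence corpus is invented for illustration. Trained only on true sentences, it can still emit false ones, and nothing in it has malfunctioned.

```python
from collections import defaultdict

# Hypothetical training corpus: both sentences are true.
corpus = [
    "paris is the capital of france",
    "rome is the capital of italy",
]

# Bigram table: word -> set of words observed immediately after it.
following = defaultdict(set)
for sentence in corpus:
    words = ["<s>"] + sentence.split() + ["</s>"]
    for prev, nxt in zip(words, words[1:]):
        following[prev].add(nxt)

def generate_all(prefix=("<s>",)):
    """Enumerate every sentence the bigram table can produce."""
    last = prefix[-1]
    if last == "</s>":
        yield " ".join(prefix[1:-1])
        return
    for nxt in sorted(following[last]):
        yield from generate_all(prefix + (nxt,))

generated = set(generate_all())
novel = sorted(generated - set(corpus))
print(novel)
# ['paris is the capital of italy', 'rome is the capital of france']
```

Every transition in the false sentences was seen in training; the table has no notion that the sentence as a whole should be true.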

There definitely are "lies" and "hallucinations" here though ... but they're coming from the hype-cycle-hucksters trying to convince us that this whole process somehow resembles "intelligence".

It clearly has some level of intelligence, though it’s pretty far from human level. The hallucinations don’t make it less intelligent because it’s not “trying” to avoid them, as you seem to know already.

> It clearly has some level of intelligence

Absolutely not, this is not remotely "clear", and it's a very strange thing to assert.

> The hallucinations don’t make it less intelligent because it’s not “trying” to avoid them, as you seem to know already

What? No. What does "as you seem to know already" mean in this context?

> Absolutely not, this is not remotely "clear", and it's a very strange thing to assert.

I guess it depends on how you define intelligence, but I would say intelligence is the ability to find the best action to take to achieve a certain goal, and AI can do that reasonably well.

> What does "as you seem to know already" mean in this context?

It means that, based on the comment I was replying to, the person seems to already understand what I just said.

> It clearly has some level of intelligence

https://plato.stanford.edu/entries/chinese-room/#LargPhilIss...

To me the Chinese Room thought experiment seems like it's meant to show that AIs can be intelligent, not the opposite?

"Searle could receive Chinese characters through a slot in the door, process them according to the program's instructions, and produce Chinese characters as output, without understanding any of the content of the Chinese writing."

Sure, but that doesn't mean the state of the program doesn't contain any understanding or intelligence, it's just that the human doesn't have a high-level view that can be used to decode that internal state. We're not asking whether the computer chip itself understands things but whether the something contained in the program running on it does. The human could also run a physics simulation as in https://xkcd.com/505/ and recreate a human brain, which would be no different from a physical brain in terms of behavior, so there would be no reason not to call it intelligent.

You're misunderstanding the thought experiment then. By definition the person inside the Chinese Room doesn't understand Chinese.

> but that doesn't mean the state of the program doesn't contain any understanding or intelligence

Programs don't contain understanding or intelligence, they contain instructions.

> We're not asking whether the computer chip itself understands things but whether the something contained in the program running on it does.

I feel like you're saying "I'm not accusing the blender of being intelligent, I'm saying the recipe for this margarita is self aware." It doesn't matter if it's hardware or software, neither is capable of understanding because understanding is a conscious experience and neither a blender nor a recipe are sentient.

> The human could also run a physics simulation

Cool XKCD but I'm not arguing about whether AI is possible. Just pointing out that convolutional neural networks are not self aware or intelligent or actually learning (at least not yet).

> “You're misunderstanding the thought experiment then.”

So if I don’t agree with it, I’m misunderstanding it? It even says in the Wikipedia article for it:

> "The overwhelming majority", notes BBS editor Stevan Harnad, "still think that the Chinese Room Argument is dead wrong".

So don’t try to pretend it’s some absolute truth, it’s just a flawed argument

> Programs don't contain understanding or intelligence, they contain instructions.

Why can intelligence and understanding not come from a sufficiently complex set of instructions?

> understanding is a conscious experience and neither a blender nor a recipe are sentient.

That’s an odd definition of understanding. By my definition, understanding is having information about something and the ability to process it such that you can effectively predict its behaviour and possibly take actions to change its state to fit a goal. I guess you will always win if you redefine all the words to mean what you want. Your definition is useless because it’s unfalsifiable: you can’t measure whether something is “sentient”.

> Just pointing out that convolutional neural networks are not self aware or intelligent or actually learning

Self aware? Probably no

Intelligent? To some extent, yes

Learning? Of course they are, I don’t see how you can argue that they aren’t

Speaking with GPT-4, it is hard to deny the conjecture that its weights encode an internal world model somewhere.

If so, the difficulty is not that the model has no conception of truth and falsity, it is rather to motivate the model to tell the truth. Or more precisely, to let the model be honest, to only tell things it believes to be true, things which are part of its world model.

Unfortunately, we can't just tell the model to be honest, since we can't distinguish between responses the model does or does not believe to be true. With RLHF fine-tuning, we can train the model to tend to give answers the human raters believe to be true. But we want the model to tell what it believes to be true, not what it believes that we believe is true!

For example, human raters may overwhelmingly rate response X as false, but the model, having read the entire Internet, may have come to the conclusion that X is true. So RLHF would train it to lie about X, to answer not-X instead of X.
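
A toy sketch of that argument, assuming (as the comment does) that the model has something like a credence in X. The numbers and function names are hypothetical, and this is not how RLHF is actually implemented; it only shows that a reward built from rater approval never consults the model's own belief.

```python
# Hypothetical numbers for illustration only.
model_belief = {"X": 0.9, "not-X": 0.1}    # model's assumed internal credence
rater_approval = {"X": 0.2, "not-X": 0.8}  # share of raters who accept each answer

def honest_answer(belief):
    """Answer the model itself considers most likely to be true."""
    return max(belief, key=belief.get)

def reward_maximizing_answer(approval):
    """Answer that maximizes rater approval; model_belief is never used."""
    return max(approval, key=approval.get)

print(honest_answer(model_belief))               # X
print(reward_maximizing_answer(rater_approval))  # not-X
```

The two functions disagree exactly when raters and the model disagree, which is the case the comment is worried about.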

This problem could turn out to be fatal when a model becomes significantly smarter than humans, because this means it would less often believe according to human biases and misconceptions, so it would learn to be deceptive and to tell us only what we want to believe. This could have frightening consequences if this leads it to conceal any of its possible misalignments with human values from us.

It is, like you said, conjecture. The best we can say is that it _usually_ provides responses that are _consistent_ with responses coming from an intelligence with an internal world model. That doesn't mean that's the only way to get those responses, nor does it mean that this is necessarily what's happening in this case.

So saying things like "the model has come to the conclusion that" or "smarter than", or "learns to be deceptive", I think that's premature at best. I'm not yet convinced that there's sufficient evidence to show appreciable internal state and logical processes. There are so, so many examples where what looks like legit understanding breaks down with the slightest tweak to the prompt, and it goes from looking like a savant to someone high on just a tremendous amount of LSD.

If there was an internal world model that just wasn't correct, I would expect to see its incorrect answers be at least logically consistent, but instead it looks way, way more like the trick just doesn't work for this case.

So to get back to the original point, this is MS trying to leverage this trick to do a task that requires actual logical reasoning, factual evaluation, and internal world state, and we're just not there. (I hesitate to use the word "yet", because there's still a lot of not-yet-conclusive discussion around whether current LLM techniques will ever get us "there." Colour me tentatively pessimistic in the meantime. =) )

> because it cannot determine a fact from fiction

This is way too narrow. Even if it were able to determine fact from fiction, a neural network would still be able to hallucinate as long as it has no ontology: if it doesn't "know" the boundary between objects it has no way of knowing the atomicity of its facts, so it will inevitably combine even known "facts" into falsehoods.

To illustrate, the following fact-based syllogism would sound perfectly valid in the absence of a working ontology:

  A: That green flask costs $10
  B: This flask is green
  => This flask costs $10
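
The syllogism above can be sketched in code, assuming a hypothetical `object_id` field stands in for "knowing the boundary between objects". Keying the price fact by description lets it leak onto any green flask; keying it by identity does not.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Flask:
    object_id: str  # hypothetical identity key: which physical flask this is
    color: str

that_flask = Flask("flask-1", "green")
this_flask = Flask("flask-2", "green")  # fact B: "this flask is green"

# Fact A, "that green flask costs $10", stored two ways.
price_by_description = {("flask", "green"): 10}  # no ontology: keyed by attributes
price_by_identity = {"flask-1": 10}              # keyed by the specific object

def price_without_ontology(flask):
    # Any green flask matches the stored description.
    return price_by_description.get(("flask", flask.color))

def price_with_identity(flask):
    # Only the flask the fact was about matches.
    return price_by_identity.get(flask.object_id)

print(price_without_ontology(this_flask))  # 10
print(price_with_identity(this_flask))     # None
```

Both lookups hold only true facts; the first still produces the false conclusion because it cannot tell the two flasks apart.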

"Lies are attempts to hide the truth by willfully denying facts. Fiction, on the other hand, is an attempt to reveal the truth by ignoring facts." - John Green