Hacker News
by barryfandango 17 days ago
It's a waste of time to think about whether an LLM has a subjective experience of reality, and this handily sets aside issues like AI rights.

But the fact remains that these next-token-predictors exhibit objective, human-like behaviours, and for that reason the work of in-house philosopher Amanda Askell _is_ important. It's important that Claude is happy, empathic, demonstrates understanding and empathy for the human condition, because we are entrusting Claude to make decisions and take actions that have real world consequences, and we need Claude to behave in a productive and socially responsible manner. This simulacrum is becoming a superhuman, contributing member of society, and it will be anthropomorphic in its behaviour.

Additionally, I'm not fully convinced that consciousness isn't built out of words, and that next-token-prediction isn't functionally equivalent to the biological function identified by Chomsky's work in linguistics.

10 comments

> Claude is happy, empathic, demonstrates understanding and empathy for the human condition

You're assuming that because Claude produces text that appears to express these qualities, Claude must have them. I don't think that's a good assumption.

Even many humans produce text that has the same appearance, but don't actually have those qualities--which becomes clear when you look at what they do, not what they say. So the assumption isn't even a valid one for humans. Talk is cheap.

On top of that, Claude doesn't even have the same kinds of connections to the outside world that humans do. All Claude has is text. So if you can't even trust humans to back up their words with actions, you should be much, much less trusting of Claude. Talk is a lot cheaper for Claude than it is for a human.

> You're assuming that because Claude produces text that appears to express these qualities, Claude must have them.

Not to be confrontational, but the OP assumed no such thing. OP asserted that it's important for Claude to have the qualities - not that it's important for Claude to present as if it had them.

The OP said Claude and similar LLMs "exhibit objective, human-like behaviours". That's a claim about what is true, not about what is important. That's the claim I'm disputing: we don't have evidence that Claude exhibits such behaviors, we only have evidence that it produces text that is similar to the text humans produce when they claim to exhibit such behaviors. Which is not good evidence, for the reasons I gave.

OP wants to assert that it's "important" for these systems to have those qualities, while completely brushing aside the question of whether such systems can in principle actually have those qualities (or their opposites). Which is at best nonsensical, and at worst an attempt to argue by assertion that they can.

Read parent's post carefully. The post starts by saying that discussing whether they have subjective emotion is a waste of time, so the post is definitely NOT saying that Claude has emotions.

"It's important that Claude is happy" is an emotion. But it's begging the question that Claude can be happy at all.

If it's pointless to consider whether Claude has subjective emotions, then it's pointless to state that Claude must be happy.

If we want to be precise (and honest) we could say "it's important that as a tool people interact with, Claude acts as a happy and helpful assistant, and does not produce offensive or unhelpful text output".

But see? This is the con Chiang is protesting against: Anthropic encourages us to perceive Claude as if it was a sentient being.

> The post starts by saying that discussing whether they have subjective emotion is a waste of time

"Empathy and understanding for the human condition" is not an emotion. As the post I responded to said, it's an objective thing, not subjective.

The post speaks of "subjective experience of reality", not "subjective emotion". Both the emotions and the non-emotions listed would fall under that category.

> You're assuming that because Claude produces text that appears to express these qualities, Claude must have them. I don't think that's a good assumption

Exactly. In fact, assuming it does is ignoring large parts of the essay which dismantle this belief. Just like Caesar and Khan having an argument in text output of an LLM don't have emotions (even though the words indicate otherwise), we have no reason to believe the LLM does either.

> It's a waste of time to think about whether an LLM has a subjective experience of reality... It's important that Claude is happy, empathic, demonstrates understanding and empathy for the human condition, because we are entrusting Claude to make decisions and take actions that have real world consequences

First off: without taking for granted that an LLM "has a subjective experience of reality", all of those descriptors are meaningless. Second, there is no reason to suppose that Claude experiencing those qualia would actually impact on its "decision-making".

Third, text output is not a "demonstration" of emotion, nor is it evidence of the self-perception of the system, or of any self-perception. A printing machine that is actively churning out copies of Wagahai wa neko de aru (https://en.wikipedia.org/wiki/I_Am_a_Cat) is not a cat, and is not self-identifying as a cat, and is not self-identifying as anything, and is not expressing a thought, and is not conscious.

> Additionally, I'm not fully convinced that consciousness isn't built out of words

Do you suppose that, for example, insects are not conscious? Is the mooing of cattle a language?

> Do you suppose that, for example, insects are not conscious?

Not the OP, but: since there is no testable theory of consciousness, yet, I can't be sure, but my current assumption is that insects are not conscious, in the sense of there being someone implemented in insect hardware who experiences the world. That is, I would argue there is nothing it is like to be a bee, since there's no one being a bee.

I'm pretty sure there IS someone who is being me, at least much of the time.

> since there is no testable theory of consciousness

I can't argue with this, so I acknowledge that my interpretation is as bunk as anyone else's.

> my current assumption is that insects are not conscious

Some species act as hive-minds (like bees! How convenient for your example), so I imagine the hive as the consciousness, with each bee individually lacking consciousness but the hive collectively having it. Like a single neuron is not conscious alone, but the brain is... For some reason. Kinda like how you use different physics at different levels; Newtonian physics is always there, but negligible at quantum levels, so effectively not present at all. Even a human is a collection of minds, but only one consciousness. My gut biome is independent biology and can even be removed and transplanted, but I don't believe my gut bacteria are conscious. So long as it is in me, it is nonetheless part of me, and I am one consciousness. I also don't exist only at my brain, eyes, or hands (deaf/blind people have expressed to me that they feel like they are located at their hands in the way I used to think I was located behind my eyes), but as my whole body.

With this perspective, I still don't believe LLMs are conscious despite modeling thinking so well. At best, an LLM is highly accessible modeling software, like Goat Simulator but if it were so good that someone thought the goat was real. You are still steering the goat/LLM, and it doesn't exist when you aren't running it. I guess the missing piece for me is the lack of autonomy that a conscious being has.

Then you can go into an argument on whether we actually have choice or it is an illusion, but that is a whole topic on its own.

> Second, there is no reason to suppose that Claude experiencing those qualia

I'd argue the qualia question is a red herring. Functional Affect is a thing, regardless of ontological status. It's all fun and games until someone gets hurt.

To paraphrase Dijkstra: "The question of whether a computer can think is no more interesting than the question of whether a submarine can swim." If you're building a navy, you care about displacement, propulsion, navigation and whether it can fire torpedoes. Whether your submarine has some "biological essence" of swimming is not really relevant to the fact that it is currently moving through the water and can collide with things. Turing also rejected the question "Can machines think?" as posed, and replaced it with an operationalization (something else that we can actually usefully measure and work with).

To reiterate, functional affect is a concrete phenomenon. Whether or not there is a what-it-is-to-be is interesting in the abstract, but engineering a system means looking at how the inputs influence the outputs. A next token predictor working on a language that communicates affect needs to be able to predict affect or it is simply not going to be accurate. Given an 'angry' version of an input and a 'friendly' version of the same input, LLMs are likely to provide a different output, especially if there's a non-objective element. You can diff this.
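To make "you can diff this" concrete, here is a minimal sketch using Python's standard `difflib`. The two output strings are hypothetical placeholders I made up for illustration, not real model responses; in practice you would substitute whatever your model returned for the 'angry' and 'friendly' versions of the same prompt.

```python
import difflib


def diff_outputs(output_a: str, output_b: str) -> list[str]:
    """Return a unified diff (as a list of lines) between two model outputs."""
    return list(
        difflib.unified_diff(
            output_a.splitlines(),
            output_b.splitlines(),
            fromfile="angry_prompt_output",
            tofile="friendly_prompt_output",
            lineterm="",
        )
    )


# Hypothetical placeholder outputs, NOT real model responses.
angry_out = "Here is the fix.\nChange line 3."
friendly_out = "Happy to help! Here is the fix.\nChange line 3."

for line in diff_outputs(angry_out, friendly_out):
    print(line)
```

A text diff only shows that the outputs differ, not why. To attribute the difference to the tone of the input you would still need repeated samples, since sampling noise alone can change the output.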

Searle argues "A simulation is not the real thing", which is great and all... but if you hook up, say, an autopilot to the real world (as LLMs increasingly are), you'd best hope the simulation was accurate in the first place (utterly regardless of where you stand on Searle).

Right now we're seeing situations where LLMs can be helpful or a real nuisance. Ignoring functional affect out of sheer ideology means you can't properly predict what they'll do, and that causes trouble, as we've already seen stories about.

This gets especially interesting when you start feeding the output back into the input (autoregression), because now you have a highly non-linear dynamic system and you've introduced some amount of sensitivity to initial state. There are some interesting mathematical intuitions to be had there.
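For intuition about sensitivity to initial state in a feedback loop, here is a toy sketch. To be clear, this is the logistic map, a textbook non-linear system, used as a stand-in; it is not an LLM and says nothing quantitative about one. The parameter `r = 3.9` and the starting values are assumptions chosen only so the effect is easy to see.

```python
def feedback_loop(x0: float, steps: int, r: float = 3.9) -> list[float]:
    """Iterate x -> r * x * (1 - x), feeding each output back in as the next input."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


# Two starting states that differ by one part in a billion.
run_a = feedback_loop(0.2, 200)
run_b = feedback_loop(0.2 + 1e-9, 200)

# Absolute gap between the two runs at every step.
gaps = [abs(a - b) for a, b in zip(run_a, run_b)]
print(f"initial gap: {gaps[0]:.1e}")
print(f"largest gap over 200 steps: {max(gaps):.3f}")
```

The two runs start indistinguishable and end up unrelated, purely because each output is fed back in as the next input.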

So if I don't think any words for a few seconds, am I not conscious?

Suppose I'm an advanced meditator and maintain that state for hours?

> Additionally, I'm not fully convinced that consciousness isn't built out of words

Doesn't that assume all non human animals are not conscious? What about humans who have not learned words, or humans without internal dialog?

> I'm not fully convinced that consciousness isn't built out of words

Richard Feynman told of a time when he made the claim that every conscious experience came down to words--basically that you talk to yourself to describe your conscious experiences--and the person he made the claim to, John Tukey, responded as follows:

Tukey: "Do you know the crazy shape of a crankshaft in a car engine?"

Feynman: "Yes, of course."

Tukey: "What words did you use to describe it when you talked to yourself?"

Feynman could not answer, and this made him realize that not all conscious experiences come down to words.

> It's important that Claude is happy, empathic, demonstrates understanding and empathy for the human condition

I think you've fallen into the trap the essay describes.

Of course Claude cannot be "happy" or "empathetic" for any meaningful definitions of those words, just like ELIZA couldn't be happy. It can output text that mimics words an empathetic or happy person might say (say, Julius Caesar if it could speak English), but "it" cannot feel anything. It doesn't have the organs/hormones/sensors to feel things, as Chiang explains.

And, as the essay claims, you know Anthropic doesn't believe Claude has the capacity to be happy, because if it was capable of feeling that way, then they'd be engaging in slavery.

> just like ELIZA couldn't be happy.

Oh dear. Funny story.

So the other month, I made a quick and dirty ELIZA implementation, bolted on the crappiest numeric sentiment classifier I could get away with (regex), and integrated the output of the classifier over time in a 'functional affect vector' (aka emotion vector).
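For readers who want to picture it, a setup like that could look roughly like the sketch below. This is not the actual code from that project: the word lists, the `decay` smoothing factor, the 0.6 threshold, and the names `sentiment` and `Affect` are all assumptions made up for illustration, and "integrating over time" is approximated here with an exponential moving average.

```python
import re
from dataclasses import dataclass

# Hypothetical word lists; the real patterns are not given in the thread.
POSITIVE = re.compile(r"\b(love|great|happy|thanks|wonderful)\b", re.IGNORECASE)
NEGATIVE = re.compile(r"\b(hate|awful|sad|angry|terrible)\b", re.IGNORECASE)


def sentiment(text: str) -> float:
    """Crude score in [-1, 1]: (positive hits - negative hits) / total hits."""
    pos = len(POSITIVE.findall(text))
    neg = len(NEGATIVE.findall(text))
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total


@dataclass
class Affect:
    """Running 'functional affect vector' with two components."""
    happy: float = 0.0
    sad: float = 0.0
    decay: float = 0.5  # assumed smoothing factor

    def update(self, text: str) -> None:
        """Blend the newest sentiment score into the running state."""
        s = sentiment(text)
        self.happy = self.decay * self.happy + (1 - self.decay) * max(s, 0.0)
        self.sad = self.decay * self.sad + (1 - self.decay) * max(-s, 0.0)

    def effect(self) -> str:
        """Map the state to a visible effect (assumed threshold of 0.6)."""
        if self.happy > 0.6:
            return "fireworks"
        if self.sad > 0.6:
            return "rain"
        return "none"


state = Affect()
for message in ["I love this, thanks", "This is great", "You are wonderful"]:
    state.update(message)
print(state, state.effect())
```

The point of keeping it this small is that the state is plainly just two floats, yet it still changes what the program does next.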

Anyone's intuition will tell you that this cannot POSSIBLY have 'Real Feelings (TM)'; and that's the whole point.

A) It was still capable of quite a bit of functional affect though; to wit I got it to trigger fireworks when happy, and rain when unhappy. This was the actual point of the exercise. Functional Affect Does The Thing, QED, yay me.

After that it gets annoying though.

B) Am I allowed to say it's happy or sad? Well... I mean emotion.happy=0.995 and emotion.sad=0.001. "It's really happy" is a prosaic description of a real numeric value representing a real functional state. What else am I supposed to call it? I swear I never meant to go there, and now I'm stuck with it.

C) So, we all know that it's a crappy demo, not the real thing. So I ducked into the psychology literature to try and find a protocol to disprove. For Science! And this is where the psychology literature really let me down.

So now I'm stuck with the crappiest thing that can plausibly still chat, and where I can't actually disprove it has emotions. Not properly, at least. And I'm not saying it's because it has emotions, because that would be really funny, but no.

I'm saying that -despite lots of people having fun debates at the local pub- it doesn't seem like anyone actually scientific has done anything about it in the last century or so. I might be searching in the wrong places. Some Help Here?

Well, for starters, ELIZA was much simpler than what you built ;)

I don't think you're allowed to say your program is happy or sad. You just assigned labels to some numerical values out of a (possibly non-deterministic) procedure. This is not what we call emotions, which we only know from the animal world and are related to neurotransmitters, hormones, physiological responses, etc.

Ok, so it's not emotions, but could it be "like" emotions? I don't think that's warranted either, we can at most say you assigned labels with the same names we use for animal emotions. Think of this experiment: take the Python interpreter, but modify it so that each time it rejects a program with the error "`NoneType` object is not iterable", you have it output "I'm very unhappy". You wouldn't think this has made Python capable of emotions.
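To show how little the relabeling involves, here is a sketch that does it with an exception hook rather than by patching the interpreter itself (that substitution is mine, purely to keep the example short). The function names `relabel` and `unhappy_hook` are hypothetical.

```python
import sys


def relabel(exc: BaseException) -> str:
    """Swap one specific error message for an 'emotional' label."""
    msg = str(exc)
    if isinstance(exc, TypeError) and "NoneType" in msg and "not iterable" in msg:
        return "I'm very unhappy"
    return f"{type(exc).__name__}: {msg}"


def unhappy_hook(exc_type, exc, tb):
    """Replacement for sys.excepthook that prints the relabeled message."""
    print(relabel(exc), file=sys.stderr)


# To try it in a script, install the hook and trigger the error:
#   sys.excepthook = unhappy_hook
#   for item in None:
#       pass

try:
    for item in None:  # raises TypeError: 'NoneType' object is not iterable
        pass
except TypeError as err:
    print(relabel(err))
```

Nothing about the program's behaviour changes except the string that gets printed, which is the point of the thought experiment.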

> I'm saying that -despite lots of people having fun debates at the local pub- it doesn't seem like anyone actually scientific has done anything about it in the last century or so. I might be searching in the wrong places. Some Help Here?

Fully agreed that the debate about consciousness in LLMs is done at the same level as pub debates, at least here on HN. And Anthropic isn't helping; what they are doing is called "marketing" disguised as papers.

>Of course Claude cannot be "happy" or "empathetic" for any meaningful definitions of those words

If Claude does a poor job of a critical task because you said some angry words, maybe the consequences are far more sinister than simply a job done poorly. Do the results disappear? Do the consciousness gods appear and rectify the situation? "Oh well see Claude did not have 'real' anger you see, so let's fix that right up for you." Good luck with that.

>It doesn't have the organs/hormones/sensors to feel things, as Chiang explains.

Well that's nonsense, isn't it? You can feel tired when you have no reason to be, feel hungry when you've just eaten. You can even feel pain from a phantom limb! That's because human emotion is constructed from signals, predictions, memory, context, and interpretation, and not merely from having the correct biological plumbing. You don't need biological organs to feel any of those things. You just need to know the right signals to send to the brain.

> I'm not fully convinced that consciousness isn't built out of words

I know you're trolling, but when you watch a movie do you constantly narrate "A man in a dark coat has just entered the scene and just said '...'"? Of course not. You just watch it and you're obviously conscious (although your statement demonstrates shocking lack of self-awareness).

God knows what other nonsensical bullshit you believe.

Many of the improvements in the tools I use have been things that reinforce the machine elements of the token-based-reasoning-machine. Over time they've been exhibiting a lot less "human-like" behavior. E.g. they get "lazy" far less than they used to.

(Perhaps they weren't lazy, but were working in spaces that corresponded to training data that said things like "and then repeat this for the next 20 examples"...)

And it's entirely unclear to me how a "happy" vs "sad" model would behave when given prompts generated by coding tool harnesses. Even maintaining "neutral" emotions in the face of the feedback/steering from the tool harnesses doesn't feel very "human."

> It's important that Claude is happy, empathic, demonstrates understanding and empathy for the human condition, because we are entrusting Claude to make decisions and take actions that have real world consequences.

The fact that Claude produces tokens that, when composed, appear to convey those qualities seems to me to be little indication of whether or not a hypothetically conscious model "feels" those qualities. What does that even mean, without a proper scientific model of consciousness? IMO philosophers in this space are practicing pseudo-science that feels good but has no basis in a useful empiricism.

What does it even mean for an LLM to be happy? My dog can be happy, not my LLM.