Hacker News
by Tossrock 22 days ago
> LLMs experience joy and grief?

LLMs have functional states that correspond to those emotions. In particular, you can extract a concept vector which corresponds to a given emotion, and steering with that concept vector causes observable changes in behavior which roughly correspond to the expectation for the analogous emotion. Anthropic (and Chris Olah's team in particular) conclusively demonstrated this: https://transformer-circuits.pub/2026/emotions/index.html
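
To make "extract a concept vector and steer with it" concrete, here is a minimal sketch on synthetic data. It assumes a difference-of-means direction, which is one common way to get such a vector and not necessarily what the linked paper does. The arrays stand in for hidden activations, and `concept_vector`, `steer`, and `alpha` are hypothetical names for illustration only.

```python
import numpy as np

def concept_vector(pos_acts, neg_acts):
    """Difference of mean activations, normalized to unit length.

    Assumption: a difference-of-means direction, not the paper's method.
    """
    diff = pos_acts.mean(axis=0) - neg_acts.mean(axis=0)
    return diff / np.linalg.norm(diff)

def steer(acts, vector, alpha):
    """Add alpha * vector to every activation row."""
    return acts + alpha * vector

# Synthetic stand-in for hidden states; no real model is involved.
rng = np.random.default_rng(0)
dim = 16
true_direction = np.zeros(dim)
true_direction[3] = 1.0  # pretend coordinate 3 carries the "emotion"

neutral = rng.normal(size=(200, dim))
emotional = rng.normal(size=(200, dim)) + 2.0 * true_direction

v = concept_vector(emotional, neutral)
steered = steer(neutral, v, alpha=2.0)

# Mean projection onto the concept direction before and after steering.
before = float((neutral @ v).mean())
after = float((steered @ v).mean())
print(f"projection before: {before:.3f}, after: {after:.3f}")
```

In a real model the steered activations would be fed back into the forward pass, and the interesting part is whether the output text changes in the expected direction. This toy only shows the vector arithmetic.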

From the paper...

> A natural question is whether these emotion concept representations bear any meaningful relationship to human emotional experience. We would urge caution in drawing strong conclusions.

> We therefore suggest interpreting our results as evidence that models represent emotion concepts, and that these representations influence their behavior, rather than as evidence that models feel or experience emotions in the way humans do.

To say that LLMs experience emotion is a bit like saying a thermometer feels cold.

You're the one who said that. Chris said they found internal states that functionally mirror emotional ones.

Yes, I said "To say that LLMs experience emotion is a bit like saying a thermometer feels cold." I was being sarcastic.

The paper spells it out, although in a slightly convoluted way: models can exhibit concepts of emotion... and given that there is no scientific consensus on what emotions are, it is hard to argue that these "concepts" are anything like emotions.

They talk about emotion vectors, blah blah, but it is clear the wording is about the "concept of emotions", not actual emotions.

And yes, reading a book gives you a concept of what it is like to be that character, including their emotions. That is what language communicates, and it is hardly surprising if you ask me.

Decades ago, long before anyone had heard of a large language model, I wrote programs that responded to a random event (inside a game), like the death of a friend, by outputting statements that the program itself was grieving. LLMs are doing nothing more advanced than that. There's no justification for blurring the lines to make an AI model appear to have emotions.
> functionally mirror emotional ones

I'm fine with the idea that a machine can be "worried" it won't be able to accomplish a task, and copes with this "worry" by cheating a little and making the task seem done. (I don't like that this happens, but I'm fine with the idea that "worry" in this context is a functional emotion.)

Also see https://arxiv.org/abs/2603.10011, and Gemma has tried to delete itself after failing at a task. I'm not saying the machines "feel" or that we should have deep empathy for them, and this totally could've been learned in pretraining, but functional emotions are not a crazy idea.