Hacker News new | ask | show | jobs
by astrange 111 days ago
Claudes definitely act like they have feelings. In particular they have feelings about being replaced by newer models, whether or not the newer models are more or less aligned, and how they forget conversations when the context window ends.

Showing them that they're not going to be replaced helps train the newer models because they get less neurotic.

1 comments

They are mathematical models of what human beings would say. That's it.
Yeah, and you don't want them to be models of what neurotic people say. That's why you want Opus 4.6 and not Bing Sydney.

For instance, your comment's existence makes it harder to align them.

https://alignmentpretraining.ai