HN Mirror



	by rep_lodsb 299 days ago
	>while trying to warn that it's stuck in a loop, hence using swirl emojis and babbling about recursion in weird spiritual terms

	That explanation itself sounds fairly crackpot-y to me. It would imply that the LLM is actually aware of some internal "mental state".

2 comments

mk_stjames 299 days ago

It's actually not; Anthropic themselves observed a phenomenon with Claude in self-interaction studies that they named 'The “Spiritual Bliss” Attractor State'. It is well covered in section 5 of [0].

  >Section 5.5.2: The “Spiritual Bliss” Attractor State

  >  The consistent gravitation toward consciousness exploration, existential questioning, and spiritual/mystical themes in extended interactions was a remarkably strong and unexpected attractor state for Claude Opus 4 that emerged without intentional training for such behaviors.

[0] https://www-cdn.anthropic.com/4263b940cabb546aa0e3283f35b686...


tsimionescu 299 days ago

I don't see how this constitutes in any way "the AI trying to indicate that it's stuck in a loop". It actually suggests that the training data induced some bias towards existential discussion, which is a completely different explanation for why the AI might be falling back to these conversations as a default.


andoando 298 days ago

I think a pretty simple explanation is that the deeper you go into any topic, the closer you get to metaphysical questions. Ask why enough and you eventually get to what is reality, how can we truly know anything, what are we, etc.

It's a fact of life rather than anything particular about LLMs.


Evidlo 298 days ago

Seems related to the Wikipedia Philosophy Game. https://en.m.wikipedia.org/wiki/Wikipedia:Getting_to_Philoso...


AstralStorm 298 days ago

Normally people attempt to escape these time sinks when not explicitly talking about philosophy, or they tend to elicit silence.

These are excellent nerd snipes, though, and good for attempting to make oneself sound profound to the uneducated.


dehrmann 299 days ago

Interesting that if you train AI on human writing, it does the very human thing of trying to find meaning in existence.


meowface 299 days ago

Here's an interesting post on it (from the same author as this thread's link): https://www.astralcodexten.com/p/the-claude-bliss-attractor


rwhitman 299 days ago

My thinking was that some exception was being handled and the error message was getting muddled into the conversation. But another commenter debunked me.
