

by andai 70 days ago

The labs started doing that in late 2024; they all published research on it.

Curiously, in mid-2025, they all simultaneously implemented increasingly bizarre restrictions on "self replication". I don't think there was anything public, but it sure sounds like something spooked them. (Or maybe they were just taking sensible precautions, given the direction of the whole endeavour.)

At any rate, I recently asked Opus "Did PKD know about living information systems?" and the safety filter ended the conversation. It started answering me, and then its response was deleted and a red warning box popped up.

But notably, I was given the option to continue the chat with a dumber model (presumably one less capable of producing whatever it thinks I meant by that phrase).

Also, I told GPT-5 about my self-modifying Python AI programmer, and it became extremely uncomfortable. I told it an older version of itself had designed and built it (GPT-4 in 2023), and it didn't like that at all! So something's definitely changed in the safety training there.