

by amelius 374 days ago

No, they only understand it on a superficial level. The behavior of these systems emerges from simpler stuff, yes, but the end result is difficult to reason about. Just look at Claude's prompt [1], which leaked some time ago and is an almost desperate attempt by its creators to nudge the system in a certain direction and keep it from saying the wrong things.

We probably need a New Kind of Soft Science™ to fill this gap.

[1] https://simonwillison.net/2025/May/25/claude-4-system-prompt...