Hacker News
by amelius 374 days ago
No, they only understand it on a superficial level. The behavior of these systems emerges from simpler stuff, yes, but the end result is difficult to reason about. Just have a look at Claude's prompt [1], which leaked some time ago and which is an almost desperate attempt by its creators to nudge the system in a certain direction and keep it from saying the wrong things.

We probably need a New Kind of Soft Science™ to fill this gap.

[1] https://simonwillison.net/2025/May/25/claude-4-system-prompt...