Hacker News
by antves 299 days ago
I agree, these mitigations alone can't be sufficient, but they are all necessary within a wider framework.

The only way to make this kind of agent safe is to work on every layer. Part of it is teaching the underlying model to see the dangers, part of it is building stronger critics, and part of it is hardening the systems they connect to. These aren’t alternatives; we need all of them.