by kergonath 282 days ago
That is completely different from the models spying on the users, which is what is discussed here.

as a vector. Train the model to start injecting backdoors past a certain date.

>Simple probes can catch sleeper agents

https://www.anthropic.com/research/probes-catch-sleeper-agen...