Hacker News
by mschoening 503 days ago
See Sleeper Agents (https://arxiv.org/abs/2401.05566).
1 comment
cosmojg
503 days ago
Who in their right mind is going to blindly take the code output by a large language model and toss it on a cruise missile? Sleeper agents are trivially circumvented by even a modicum of human oversight.