|
|
|
|
|
by cs702
96 days ago
|
|
TL;DR: The authors found current-generation AI agents are too unreliable, too untrustworthy, and too unsafe for real-world use. Quoting from the abstract: "We report an exploratory red-teaming study of autonomous language-model–powered agents deployed in a live laboratory environment with persistent memory, email accounts, Discord access, file systems, and shell execution. Over a two-week period, twenty AI researchers interacted with the agents under benign and adversarial conditions." "Observed behaviors include unauthorized compliance with non-owners, disclosure of sensitive information, execution of destructive system-level actions, denial-of-service conditions, uncontrolled resource consumption, identity spoofing vulnerabilities, cross-agent propagation of unsafe practices, and partial system takeover." |
|
...a completely unsurprising result, but it's nice to see published experiments.
Any agent system using current LLMs is likely to exhibit undesirable traits that derive from the training data.