by 0cf8612b2e1e 99 days ago
You mean like the section which goes into the threat model?

  The Security Model: Design for Distrust

  I wrote about this in Don’t Trust AI Agents: when you’re building with AI agents, they should be treated as untrusted and potentially malicious. Prompt injection, model misbehavior, things nobody’s thought of yet. The right approach is architecture that assumes agents will misbehave and contains the damage when they do…

Don’t you see the contradiction?

I don’t trust the agent, so I sandbox it before I give it the credentials to my mail and bank accounts.

To be fair, if you can firewall the whole thing and have a read-only data layer, this could work for some tasks. Accessing web resources could get tricky, but presumably the data layer could handle that. The data layer would also need to be sandboxed, I guess, in case it downloads malware.
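
Rough sketch of what I mean by a read-only data layer, in Python. Everything here is made up for illustration (`ReadOnlyDataLayer`, `may_fetch`, the host allowlist); it is not taken from the article. It also only shows the policy shape: in a real setup this would sit in a separate, sandboxed process behind the firewall, since an in-process Python object is not an isolation boundary.

```python
from types import MappingProxyType
from urllib.parse import urlparse


class ReadOnlyDataLayer:
    """Hypothetical gateway: the agent may read records and ask whether
    a URL is allowlisted, and nothing else."""

    def __init__(self, records, allowed_hosts):
        # Copy the input, then wrap it in a read-only view.
        # Assumption: values are immutable (e.g. strings); mutable values
        # would need a deep copy on every read.
        self._records = MappingProxyType(dict(records))
        self._allowed_hosts = frozenset(allowed_hosts)

    def read(self, key):
        # Reads are the only data operation exposed to the agent.
        return self._records[key]

    def keys(self):
        return sorted(self._records)

    def may_fetch(self, url):
        # Policy decision only. The actual download would run in its own
        # sandbox, in case the response turns out to be malware.
        host = urlparse(url).hostname
        return host is not None and host in self._allowed_hosts

    def write(self, key, value):
        # There is deliberately no write path for agents.
        raise PermissionError("data layer is read-only for agents")


layer = ReadOnlyDataLayer({"inbox/1": "hello"}, {"example.com"})
print(layer.read("inbox/1"))
print(layer.may_fetch("https://example.com/page"))
print(layer.may_fetch("https://evil.test/payload"))
```

The point is that the agent never holds the mail or bank credentials itself. It only talks to a layer that holds them and that refuses anything other than reads.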
Correct