

by 0cf8612b2e1e 99 days ago

You mean like the section which goes into the threat model?

  The Security Model: Design for Distrust

  I wrote about this in Don’t Trust AI Agents: when you’re building with AI agents, they should be treated as untrusted and potentially malicious. Prompt injection, model misbehavior, things nobody’s thought of yet. The right approach is architecture that assumes agents will misbehave and contains the damage when they do…


croes 99 days ago

Don’t you see the contradiction?

I don’t trust the agent, so I sandbox it, and then I give it the credentials to my mail and bank accounts.


wakawaka28 99 days ago

To be fair, if you can firewall the whole thing and have a read-only data layer, this could work for some tasks. This could get tricky when it comes to accessing web resources, but the data layer could handle it presumably. The data layer too will need to be sandboxed, I guess, in case it were to download malware.
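A rough sketch of what a read-only data layer could look like, in Python. Everything here is hypothetical (the `ReadOnlyDataLayer` class, its method names, the `inbox/1` key), and an in-process wrapper like this is not a security boundary on its own; it only illustrates the interface the agent would see, assuming the real isolation comes from the firewall and a separate process.

```python
from types import MappingProxyType


class ReadOnlyDataLayer:
    """Hypothetical facade: the agent can read records but never write them."""

    def __init__(self, records):
        # Copy first so later changes to the caller's dict do not leak in,
        # then wrap the copy in a read-only mapping view.
        self._records = MappingProxyType(dict(records))

    def read(self, key):
        # Plain lookup; raises KeyError for unknown keys.
        return self._records[key]

    def keys(self):
        # Sorted list of the keys the agent is allowed to see.
        return sorted(self._records)

    def write(self, key, value):
        # Every write attempt from the agent side is rejected.
        raise PermissionError(f"read-only data layer: cannot write {key!r}")


# Illustrative data only.
layer = ReadOnlyDataLayer({"inbox/1": "Subject: invoice"})
```

The agent would get `layer.read(...)` and `layer.keys()` and nothing else; anything that changes state has to go through some other, separately reviewed path.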


jryio 99 days ago

Correct
