|
|
|
|
|
by johnsmith1840
129 days ago
|
|
The solution is proxy everything. The agent doesn't have an api key, or yoyr actual credit card. It has proxies of everything but the actual agent lives in a locked box. Control all input out of it with proper security controls on it. While not perfect it aleast gives you a fighting chance when your AI decides to send a random your SSN and a credit card to block it. |
|
That's the hard part: how?
With the right prompt, the confined AI can behave as maliciously (and cleverly) as a human adversary--obfuscating/concealing sensitive data it manipulates and so on--so how would you implement security controls there?
It's definitely possible, but it's also definitely not trivial. "I want to de-risk traffic to/from a system that is potentially an adversary" is ... most of infosec--the entire field--I think. In other words, it's a huge problem whose solutions require lots of judgement calls, expertise, and layered solutions, not something simple like "just slap a firewall on it and look for regex strings matching credit card numbers and you're all set".