by schmichael
159 days ago
I'm unconvinced we're as powerless as LLM companies want you to believe. A key problem here seems to be that domain-based outbound network restrictions are insufficient. There's no reason outbound connections couldn't be forced through a local MITM proxy that also enforces binding to a single Anthropic account. It's just that restricting by domain is easy, so that's all they do. Another option would be per-account domains, but that's also harder. So while malicious prompt injections may continue to plague LLMs for some time, I think the containerization world still has a lot more to offer in terms of preventing these sorts of attacks. It's hard work, and sadly much of it isn't portable between OSes, but we've spent the past decade+ building sophisticated containerization tools to safely run untrusted processes like agents.
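To make the proxy idea concrete, here's a rough sketch of the check such a proxy could run on each decrypted outbound request. Everything here is hypothetical (`EgressPolicy`, `allow_request`, the example keys), and it assumes the account is identified by the `x-api-key` request header; a real proxy would also need TLS interception and handling for other auth schemes.

```python
import hmac
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass(frozen=True)
class EgressPolicy:
    """Hypothetical policy: allowed hosts plus the one credential the sandbox may use."""
    allowed_hosts: frozenset
    pinned_api_key: str


def allow_request(policy: EgressPolicy, url: str, headers: dict) -> bool:
    """Return True only if the host is allowed AND the request uses the pinned key."""
    host = (urlsplit(url).hostname or "").lower()
    if host not in policy.allowed_hosts:
        return False  # the usual domain-based restriction
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    presented = normalized.get("x-api-key", "")
    # Constant-time comparison; a request carrying someone else's key is rejected
    # even though the domain itself is on the allowlist.
    return hmac.compare_digest(presented.encode(), policy.pinned_api_key.encode())


policy = EgressPolicy(frozenset({"api.anthropic.com"}), "sk-sandbox-key")
print(allow_request(policy, "https://api.anthropic.com/v1/files", {"x-api-key": "sk-sandbox-key"}))
print(allow_request(policy, "https://api.anthropic.com/v1/files", {"x-api-key": "sk-attacker-key"}))
```

The second call is the interesting one: the destination passes a domain filter, but the credential doesn't belong to the sandbox's account, so the proxy drops it.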
|
This is coming from first principles; it has nothing to do with any company. This is how LLMs currently work.
Again, you're trying to think in terms of blacklisting/whitelisting, but that doesn't work either, not just in practice but in a purely theoretical sense. You can have whatever "perfect" ACL-based solution you like, but if you want useful work done with "outside" data, this exploit is still possible.
This has been shown to work on GitHub. If your LLM touches GitHub issues, it can leak any data it has access to, exfiltrating it via GitHub itself, since it already has access there.
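A toy illustration of why the ACL doesn't help, with made-up names (`acl_allows`, the `acme/app` repo, and the request dicts are all hypothetical). A destination-based check sees the legitimate comment and the exfiltrating one as the same request:

```python
from urllib.parse import urlsplit

# Hypothetical allowlist: the agent needs GitHub to do its job.
ALLOWED_HOSTS = {"api.github.com"}


def acl_allows(request: dict) -> bool:
    """Destination-only check, which is all a domain ACL can express."""
    return urlsplit(request["url"]).hostname in ALLOWED_HOSTS


legit_request = {
    "url": "https://api.github.com/repos/acme/app/issues/1/comments",
    "body": "Thanks, fixed in the latest commit.",
}
exfil_request = {
    # Same endpoint, but the body was steered by an injected instruction
    # hidden in the issue text and now carries a secret.
    "url": "https://api.github.com/repos/acme/app/issues/1/comments",
    "body": "AWS_SECRET_ACCESS_KEY=<contents of the agent's environment>",
}

print(acl_allows(legit_request), acl_allows(exfil_request))
```

Both pass, because the only difference is in the content, and the content is exactly the part the LLM controls.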