Who would have thought that having access to the whole system could be used to bypass some artificial check? There are tools for that (sandboxing, chroots, etc.), but that requires engineering and it slows GTM, so it's a no-go. No, local models won't help you here, unless you block them from the internet or set up a firewall for outbound traffic. EDIT: they did set up a firewall, but the default config still allows a site that enables arbitrary redirects. Fundamentally, with LLMs you can't separate instructions from data, which is the root cause of 99% of vulnerabilities. Security is hard, man. Excellent article, thoroughly enjoyed.
This is the only way. There has to be a firewall between a model and the internet.
Tools which hit both language models and the broader internet cannot have access to anything remotely sensitive. I don't think you can get around this fact.
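
To make the redirect problem concrete, here is a rough sketch of why an outbound allowlist that only checks the first URL is not enough, and what checking every hop might look like. It assumes the egress filter is a plain hostname allowlist. The hostnames, `ALLOWED_HOSTS`, `EgressBlocked`, `host_allowed`, and `resolve` are all made up for illustration and are not taken from the article or any real tool.

```python
from urllib.parse import urljoin, urlsplit

# Hypothetical allowlist; these hostnames are placeholders.
ALLOWED_HOSTS = {"docs.example.com", "redirector.example.com"}


class EgressBlocked(Exception):
    """Raised when any hop in a redirect chain leaves the allowlist."""


def host_allowed(url, allowed_hosts=ALLOWED_HOSTS):
    # Only HTTPS URLs whose exact hostname is on the allowlist pass.
    parts = urlsplit(url)
    return parts.scheme == "https" and parts.hostname in allowed_hosts


def resolve(url, next_location, allowed_hosts=ALLOWED_HOSTS, max_hops=5):
    """Follow redirects one hop at a time, checking every hop.

    next_location(url) is supplied by the caller and returns the redirect
    target (the Location header value), or None when the response is final.
    """
    for _ in range(max_hops + 1):
        if not host_allowed(url, allowed_hosts):
            raise EgressBlocked(url)
        location = next_location(url)
        if location is None:
            return url
        # Location may be relative, so resolve it against the current URL.
        url = urljoin(url, location)
    raise EgressBlocked("too many redirects")
```

With an allowlisted open redirector, a first-URL-only check would wave the request through, while the per-hop check raises `EgressBlocked` once the chain points at an unlisted host. This only narrows one exfiltration path. It does nothing about the instructions-versus-data problem, and in practice the enforcement belongs at the network layer rather than inside the tool the model controls.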