I was amused to notice that the Gemini CLI leans into this, with a `--yolo` flag that will skip confirmation from the user before running tools. Or you can press Ctrl-Y while in the CLI to do the same thing.
Interesting! So this is kinda like whole-program static analysis, but the "program" is like eBPF - no loops, no halting problem, etc. This is great for defence in depth (stops the agent from doing the wrong thing), but IMO the process still needs sandboxing (RCE).
I would love to see a cross-platform sandboxing API (to unify some subset of seccomp, AppCointainer, App Sandbox, pledge, capsicum, etc), perhaps just opportunistic/best-effort (fallback to allow on unsupported capability/platform combinations). We've seen this reinvented over and over again for isolated execution environments (Java, JS, browser extensions...), maybe this will finally trigger the push for something system-level, that any program can use.
Yeah, the CaMeL approach is mainly about data flow analysis - making sure to track how any sources of potentially malicious instructions flow through the system. You need to add sandboxes to that as well - and the generated code from the CaMeL process needs to run in a sandbox.
I was amused to notice that the Gemini CLI leans into this, with a `--yolo` flag that will skip confirmation from the user before running tools. Or you can press Ctrl-Y while in the CLI to do the same thing.