| the models will never reliably avoid egregious behavior. think of it like every "good intentions" morality tale: there's almost always some genuine context where that behavior is wanted. instead, the coding harness, or some deterministic tool, will need hardcoded security features. in opencode, almost all the power comes from bash, and all the other permissions are just charades. it's powerful, and insecure because of it. you can sandbox it, but then you fight the sandbox to pipe in your assets. the sandbox becomes porous because otherwise it's useless. MCPs don't address much either. what we are looking for is a portal or protocol that has the model, the harness, and the actions tunneled, like ssh, to some fixed, scoped, and limited shell alongside the assets. then the user and the LLM can negotiate assets and actions as needed via the protocol. but alas, as your comment suggests, people think there's some perfect context that'll prevent bad things from happening. the libertarian paradise without regulation. |
Take a look at a project I just finished this weekend: https://clawband.io
It's an agent permissioning platform that isolates your service connections and puts a granular permissioning layer on top of them. So rather than your agent getting full access to a service, it gets a Clawband key that it can use to request actions. Clawband then checks the parameters of each request to see whether it is allowed.
The classic example I use is giving your agent access to privacy.com. You may want it to be able to list your cards but not create one, or you may want to allow creating cards but only up to a certain limit.
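
A rough sketch of that kind of parameter check, for illustration only. This is not Clawband's actual code or API: the action names, the `spend_limit` field, and the cap of 50 are all made up for the example. The idea is default deny, with each allowed action getting its own check on the request parameters.

```python
from typing import Any, Callable

Params = dict[str, Any]


def limit_at_most(max_limit: float) -> Callable[[Params], bool]:
    """Build a check that requires a numeric spend_limit within the cap."""
    def check(params: Params) -> bool:
        limit = params.get("spend_limit")  # hypothetical parameter name
        return isinstance(limit, (int, float)) and 0 < limit <= max_limit
    return check


# Hypothetical policy attached to one agent key: action name -> parameter check.
# Any action that is not listed here is denied.
POLICY: dict[str, Callable[[Params], bool]] = {
    "list_cards": lambda params: True,   # always allowed
    "create_card": limit_at_most(50),    # allowed only up to the cap
}


def is_allowed(action: str, params: Params) -> bool:
    """Return True only if the action is listed and its parameters pass."""
    check = POLICY.get(action)
    return check is not None and check(params)


if __name__ == "__main__":
    print(is_allowed("list_cards", {}))
    print(is_allowed("create_card", {"spend_limit": 25}))
    print(is_allowed("create_card", {"spend_limit": 500}))
    print(is_allowed("delete_card", {"card_id": "abc"}))
```

The point of doing it this way is that the check runs outside the model, so it holds no matter what context the agent was given.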
The plan is to make it open source and self-hostable, for the security and sanity of users, while still having a SaaS offering as a demo and for ease of use.