by hollerith
387 days ago
I'm confused: can you explain how the sandbox helps? I mean, if the plan is not to let the AI write any code that actually gets allocated computing resources, not to let the AI interact with any people, and not to give the AI write access to the internet, then I can see how having a good sandbox around it would help. But how many AIs are there (or will there be) where that is the plan and the AI is powerful enough that we care about its alignedness?
You start with the low-hanging fruit: run tool commands inside a kernel sandbox that switches off internet access, then re-provide access only via an HTTP proxy that implements some security policies. For example, instead of giving the AI direct access to API keys, you can give it a fake one that the proxy then substitutes with the real key. The proxy can also restrict access by domain and verb, e.g. allow GET on everything but restrict POST to just one or two domains you know it needs for its work. You restrict file access to only the project directory, and so on.
Then you can move upwards and start to sandbox the sub-components the AI is working on using the same sort of tech.
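
To make the proxy idea concrete, here is a rough sketch of the policy check such a proxy might run per request, plus a project-directory check for file access. Everything in it is an assumption for illustration: the key values, the host allowlist, the project path, and the names `Request`, `apply_policy`, and `path_allowed` are all hypothetical, and a real setup would hang this logic off an actual proxy and kernel sandbox rather than plain functions.

```python
from dataclasses import dataclass
from pathlib import Path
from urllib.parse import urlsplit

# All names and values below are hypothetical, for illustration only.
PLACEHOLDER_KEY = "sk-placeholder"         # the only key the AI ever sees
REAL_KEY = "sk-real-secret"                # held by the proxy, outside the sandbox
READ_ONLY_METHODS = {"GET", "HEAD"}        # verbs allowed on any domain
WRITE_ALLOWED_HOSTS = {"api.example.com"}  # domains the AI may POST to
PROJECT_DIR = Path("/work/project")        # the only directory the AI may touch


@dataclass
class Request:
    method: str
    url: str
    headers: dict


def apply_policy(req: Request):
    """Return the request to forward upstream, or None to block it."""
    host = urlsplit(req.url).hostname or ""
    method = req.method.upper()

    # Read-only verbs go anywhere; everything else needs an allowlisted host.
    if method not in READ_ONLY_METHODS and host not in WRITE_ALLOWED_HOSTS:
        return None

    headers = dict(req.headers)
    # Swap in the real key only for trusted hosts, so the placeholder
    # is all that could ever leak to any other domain.
    if host in WRITE_ALLOWED_HOSTS:
        headers = {k: v.replace(PLACEHOLDER_KEY, REAL_KEY)
                   for k, v in headers.items()}
    return Request(method, req.url, headers)


def path_allowed(path: str) -> bool:
    """True if path, after resolving '..' and symlinks, stays in PROJECT_DIR."""
    root = PROJECT_DIR.resolve()
    resolved = (root / path).resolve()
    return resolved.is_relative_to(root)  # Python 3.9+
```

The point of the placeholder key is that nothing inside the sandbox is worth exfiltrating: the real credential only appears on the proxy's outbound side, and only for hosts you chose.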