Hacker News new | ask | show | jobs
by tracker1 41 days ago
I think I'd go a slightly different route, if I was trying to do this, and that would be to give each agent at least a VM. Not to mention an email account, so that they can coordinate/collaborate with the other "developers" ...

In the end, I firmly believe that agents need a lot more guidance in terms of direction than what a lot of people seem to be giving. Let alone code reviews.

1 comments

VMs bring greater isolation but they're a lot heavier and slower. The agents just use github for synchronization here, though I've been considering building some kind of todo list overlay locally.
Yes... but with full VMs, you can integrate docker (compose) into the application workflows without risking conflicts between separate agents on the same system/vm.
Did you read the post? That's exactly the problem I just solved.
That's the part of your post I have trouble understanding. That you need to work around colliding ports suggests that the containers spun up by the agent run directly on the host, not inside some form of nested containerization. But if you do that, how do you ensure that the application running in those containers is sandboxed just as strictly as the agent itself?
The docker compose stack for the applications is spun up on the host. The agents have access to the docker socket which means they can talk to docker from inside their sandbox and spin up new sibling containers on the host. Yolobox isn’t designed for full isolation- just accidental commands you wouldn’t want to run on the host, and a convenient way of giving agents a customizable environment they control.

Early on in development I tried to harden the container to prevent deliberate escapes by the agent. This was a waste of time as the agents just kept finding more and more exploits when I asked them to try and break out.

So the right way to use yolobox is to spin up one VM as a secure sandbox, and then use yolobox to separate individual agents within the VM?
Sounds interesting. What kind of exploits did they find, apart from docker being exposed?