| HN Mirror


by isityettime 19 days ago

> If you want to work on the code then you need to have access to the repositories, so you need the github token.

Definitely not! I only have an agent work in one repo at a time, with cross-repo work coordinated by me. I have a ton of local checkouts and leave them visible read-only to all of my agents. They can look at company code in my local checkouts, and they can download or browse open-source code, or look at it in the .src outputs of packages from Nixpkgs.

> Then, to test the app, you may need your own backend token.

I just don't let my agents test apps that run remotely, for better or for worse.

> And VPN.

This doesn't really expose anything on my system because everything internal that it could hit is authenticated, and it can't access any of my credentials. But I could do a better job restricting network access.

> your branch of the code is in danger

The agent isn't permitted by the sandbox to read the secrets it needs for `git push`. Indeed, I have commit signing enabled and the agent can't even read the files it needs for `git commit`! It can write code, it can write tests, it can run some tests, and it can run web applications locally and play with those.
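To make the shape of that policy concrete, here's a toy model in Python. It is not the config format of the tool I use, and every path in it is made up for illustration; the real enforcement is done by the OS, not by a function like this. It only shows the rule ordering: denied paths win, the agent's own checkout is read-write, other checkouts are read-only, and everything else is refused (a real policy would also need to allow reads of system paths).

```python
import posixpath
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class Policy:
    read_write: tuple  # roots the agent may read and modify
    read_only: tuple   # roots the agent may only read
    denied: tuple      # roots the agent may not touch at all


def _under(path, roots):
    """True if path equals one of the roots or lies beneath one of them."""
    # normpath collapses ".." so "repo/../../.ssh" cannot dodge a deny rule.
    p = PurePosixPath(posixpath.normpath(path))
    return any(p == PurePosixPath(r) or PurePosixPath(r) in p.parents for r in roots)


def allowed(policy, path, mode):
    """mode is "r" or "w". Deny rules win; anything unlisted is refused."""
    if _under(path, policy.denied):
        return False
    if _under(path, policy.read_write):
        return True
    if _under(path, policy.read_only):
        return mode == "r"
    return False


# Hypothetical layout: one writable checkout, other checkouts visible
# read-only, push and signing credentials denied outright.
POLICY = Policy(
    read_write=("/home/dev/agent/repo-a",),
    read_only=("/home/dev/src",),
    denied=("/home/dev/.ssh", "/home/dev/.gnupg"),
)
```

With this model, `allowed(POLICY, "/home/dev/.gnupg/pubring.kbx", "r")` is refused, which is the "can't even read the files it needs for `git commit`" case when signing is on.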

But then I do the final testing and then turn its changes into 1-5 git commits, walking through them and selectively staging, skipping, or dropping them hunk-by-hunk according to my judgment. I still do tons of review. I just don't review edits or commands; instead I review and test whole drafts, whole changesets. It's less fatiguing because the thing I'm reviewing is more directly the thing I'm trying to produce.

I guess it ain't YOLO nirvana but I wasn't really looking for that.


prerok 19 days ago

Thank you for the explanation but I still don't quite get it. Is this code mounted to a separate VM where the agent is running? I mean, how does the sandboxing of agents really work?

The reason I am asking is because if it's not sandboxed on the OS level, then commands it runs may escape the harness sandboxing. Even more problematic can be a command added to some auto running script that will get executed at some point outside of the sandbox (when the developer is doing actions). So, reviewing everything before anything is executed seems like the only safe way to do it. What am I missing?

isityettime 19 days ago

The tool I currently use does OS-level sandboxing (the OS enforces the sandbox), not sandboxing built into the harness (like what Codex has turned on by default) or hypervisor-level sandboxing (i.e., the agent sees an OS that is itself sandboxed, or an OS that constitutes the sandbox). To relax or adjust the sandbox, I have to kill the agent and reinvoke the sandbox with a new policy, which then relaunches the agent.

> Even more problematic can be a command added to some auto running script that will get executed at some point outside of the sandbox (when the developer is doing actions).

That's a real potential problem, but unfortunately the default "approve every edit" regime doesn't actually address it, either. In the normal per-command approval process, the approvals are often just suggestions; Claude will do things like silently edit files in "plan mode" anyway, for example.

If you're deeply worried about this particular kind of sandbox escape you probably don't want the agent's checkout to be your usual checkout. Then if you do have some scripts that can run automatically inside a project directory (e.g., via direnv), you just never approve them in the path to the agent's checkout and make sure direnv's state dir is unwritable inside your sandboxes. If you have code inside your project that runs without any user intervention at all, and has no approval process at all so that it will be activated or trusted even on a fresh clone you've never visited or seen before... yikes. That sucks. :(
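If you want to check the "direnv's state dir is unwritable" part rather than trust it, something like the sketch below can be run from inside the sandbox. It assumes direnv keeps its approvals under `$XDG_DATA_HOME/direnv` (falling back to `~/.local/share/direnv`); verify that against your direnv version. The function names are mine, not part of any tool.

```python
import os
from pathlib import Path


def direnv_state_dir(env=None):
    """Guess where direnv stores approvals (assumed location, see above)."""
    env = os.environ if env is None else env
    base = env.get("XDG_DATA_HOME") or str(Path.home() / ".local" / "share")
    return Path(base) / "direnv"


def nearest_existing(path):
    """Walk upward until we reach a path that exists."""
    p = Path(path)
    while not p.exists():
        p = p.parent
    return p


def writable_paths(root):
    """Return every path at or below root that this process may write to.

    If root does not exist yet, report its nearest existing ancestor when
    that is writable, since the state dir could then simply be created.
    """
    root = Path(root)
    if not root.exists():
        ancestor = nearest_existing(root)
        return [ancestor] if os.access(ancestor, os.W_OK) else []
    candidates = [root, *root.rglob("*")]
    return [p for p in candidates if os.access(p, os.W_OK)]
```

Inside the sandbox, `writable_paths(direnv_state_dir())` should come back empty. Anything it lists is a place where the agent could plant or alter an approval. Note that `os.access` only reflects ordinary permission checks for the current process, so treat this as a smoke test, not a proof.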

Anyway if you take the precaution above you can still review edits to those files before they have a chance to run (or just never run them).

One approach suggested by another user in this discussion that sounds useful to me is giving the agent a VM from which it can push to a local bare clone or something like that, so that's how it emits code to you. That way it's not writing scripts to your box at all.
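I haven't set this up myself, so take the following as a rough sketch of the flow, not a tested recipe. It fakes both sides in one temp directory: `drop.git` stands in for the bare clone on your box, and `agent-work` stands in for the checkout inside the VM (in a real setup the remote would be reached over SSH or a shared mount). The branch name, file name, and identity are all invented for the example.

```python
import shutil
import subprocess
from pathlib import Path


def git(*args, cwd):
    """Run a git command, fail loudly on error, return trimmed stdout."""
    result = subprocess.run(
        ["git", *args], cwd=cwd, check=True, capture_output=True, text=True
    )
    return result.stdout.strip()


def demo(root):
    if shutil.which("git") is None:
        raise RuntimeError("git is required for this sketch")
    root = Path(root)
    drop = root / "drop.git"      # bare repo on the host: no working tree
    agent = root / "agent-work"   # stand-in for the checkout inside the VM

    git("init", "--bare", str(drop), cwd=root)
    git("clone", str(drop), str(agent), cwd=root)

    # "Agent" side: write a change and commit it unsigned, with a
    # throwaway identity, since it has no access to real credentials.
    (agent / "feature.txt").write_text("draft change\n")
    git("add", "feature.txt", cwd=agent)
    git(
        "-c", "user.name=agent",
        "-c", "user.email=agent@example.invalid",
        "-c", "commit.gpgsign=false",
        "commit", "-m", "Draft: add feature",
        cwd=agent,
    )
    # Push to an explicit ref so the local default branch name is irrelevant.
    git("push", "origin", "HEAD:refs/heads/agent-draft", cwd=agent)

    # Host side: read the pushed history from the bare repo. No files
    # from the draft are checked out on the host at this point.
    return git("log", "--format=%s", "agent-draft", cwd=drop)
```

From there you'd fetch `agent-draft` into your own checkout and review it as a whole changeset before anything from it lands in a directory where scripts can run.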