HN Mirror



	by jgalt212 4 hours ago
	Has anyone run a study on how long you can run an agent as root before irreparable damage is done to the VM? A sort of gambler's ruin for the YOLO LLM Age.

2 comments

nijave 3 hours ago

I gave Sonnet 4.6 root access to my Android via adb and it wrote Frida scripts to help me recover the encryption keys from SwiftBackup.

Also gave Opus 4.6 access to a Kubernetes container and it was able to use pyrasite (a tool that uses gdb to attach to a running Python process and inject code) to debug a "memory leak" in Python.

I don't think I'd let them run unattended on anything I care about, especially if there weren't backups, but they've never tried to break anything while supervised.

Usually it's significantly faster and more accurate to give the LLM/harness access to the thing to debug than to try to copy/paste back and forth.


andai 3 hours ago

It's been a while, but last year I'd see posts like "Claude nuked my homedir / entire drive" on a regular basis. I don't know if they fixed that (or just made it very rare).


nijave 2 hours ago

In fairness to Claude, I've nuked my homedir (I had 2 tmux panes open, one in home and one in /tmp/..., and the wrong one was focused when I ran rm -rf *) and broken VMs far more times than it has. I now embrace IaC and backups.


Wowfunhappy 4 hours ago

https://forums.macrumors.com/threads/screw-it-lets-make-clau...

For me, it took a bit over six weeks of Claude running unattended perpetually.
