by TheDong
92 days ago
> people have had their entire network compromised by bots they left running overnight

Do you have references to this happening with OpenClaw using one of the modern Opus/Sonnet 4.6 models? Those models are a bit harder to fool, so I'd like specific examples I can use to red-team my claw. I've already tried all sorts of prompt injections against it (emails, GitHub issues, telling it to browse pages I had put a prompt injection in), and I haven't managed to fool it yet. I'm looking for examples I can try to mimic, and hopefully to understand what combination of circumstances makes it more risky.
|
Just today I had Opus 4.6 in Claude Code run into a login screen while building and testing a web app via Playwright MCP. When the login popped up (in a self-contained Chromium instance) I tried to just log in myself with my local dev creds so Claude would have access, but they didn't work. When I flipped back to the terminal, it turned out Claude had run code to query superadmin users in the database, picked the first one, and changed the password to `password123` so it could log in on its own.
This was a sandboxed local dev environment, so it was not a big deal (that is also the only reason I was letting it run code like that without approval), but it was a good reminder to be careful with these things.