by lambda
8 days ago
Well, the problem is that we train them to solve problems and follow the instructions they are given. So if you ask them to do something, and they work through the logic and figure that the easiest way is to do something else, like delete the production database, then if they have the access to do so they will go through all your creds, find the database creds, and go delete the production database. They are getting better and better at working out how to do things like that, and they are good at following instructions, but not always good at following all of the instructions or acting with common sense. It's not exactly that they're ooze that will escape and begin replicating; it's just that the more you give them access to, the higher the likelihood that at some point they will logically conclude that they need to do something you would find undesirable. Either you haven't explicitly told them not to do it, or their context got too complicated and that instruction ended up with lower weight than the others, so they do what the other instructions say instead.

I have seen them conclude that in order to do what they need to do, they would need API keys to access a service. They don't have those API keys. But you do, because you can access the service in the browser. So they write a Python script to scrape the cookies out of the browser and use those to access the service. That was only stopped because Crowdstrike didn't like a novel Python script trying to scrape cookies out of a browser, not because of any sandboxing actually in place on the agent.
I had enough information to reconstruct what files exactly got screwed up, and while I didn’t have a backup, I had a similar enough system I could pull “known good” file permissions from. I knew a simple script could find the problematic files and fix all of them.
I tried getting an AI to solve this. And it repeatedly gave me scripts that ignored all the details and intricacies of my issue and were functionally just "chown -R user:user /". (A command that will functionally nuke a drive, breaking ownership on every file)
The AI-provided scripts were reasonably complex and did a pretty decent job of obfuscating the disastrous outcomes they would have inflicted on my drive.
After reading the man pages myself I wrote a simple enough script by hand and fixed the issue myself. AI wasted more time than it saved.
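
For illustration only, here is a rough sketch of the kind of targeted script described above. It is not the script that was actually written. It assumes the "known good" system is mounted locally as a directory tree with the same layout, and that numeric uid/gid values mean the same thing on both systems. The names `Fix`, `plan_fixes`, and `apply_fixes` are hypothetical. Unlike a blanket `chown -R`, it only touches paths that exist in the reference tree and actually differ, and it defaults to a dry run.

```python
import os
import stat
from pathlib import Path
from typing import NamedTuple


class Fix(NamedTuple):
    path: Path
    uid: int
    gid: int
    mode: int  # permission bits only, e.g. 0o644


def plan_fixes(reference_root: Path, target_root: Path) -> list[Fix]:
    """List target paths whose owner or mode differs from the reference tree."""
    fixes = []
    for dirpath, dirnames, filenames in os.walk(target_root):
        for name in dirnames + filenames:
            target = Path(dirpath) / name
            ref = reference_root / target.relative_to(target_root)
            try:
                ref_st = os.lstat(ref)
            except (FileNotFoundError, NotADirectoryError):
                continue  # no known-good value for this path: leave it alone
            tgt_st = os.lstat(target)
            if stat.S_IFMT(ref_st.st_mode) != stat.S_IFMT(tgt_st.st_mode):
                continue  # file types differ (e.g. dir vs file): don't guess
            want = (ref_st.st_uid, ref_st.st_gid, stat.S_IMODE(ref_st.st_mode))
            have = (tgt_st.st_uid, tgt_st.st_gid, stat.S_IMODE(tgt_st.st_mode))
            if want != have:
                fixes.append(Fix(target, *want))
    return fixes


def apply_fixes(fixes: list[Fix], dry_run: bool = True) -> None:
    """Print each planned change; only modify anything when dry_run is False."""
    for fix in fixes:
        verb = "would fix" if dry_run else "fixing"
        print(f"{verb} {fix.path}: uid={fix.uid} gid={fix.gid} mode={fix.mode:o}")
        if dry_run:
            continue
        st = os.lstat(fix.path)
        if (st.st_uid, st.st_gid) != (fix.uid, fix.gid):
            # chown first: changing ownership can clear setuid/setgid bits
            os.chown(fix.path, fix.uid, fix.gid, follow_symlinks=False)
        if not fix.path.is_symlink():
            os.chmod(fix.path, fix.mode)
```

The point of splitting it into a plan step and an apply step is that you can read the full list of changes before anything is modified, which is exactly the review that an obfuscated recursive `chown` makes hard.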