by shiandow
58 days ago
For a company that puts DO NOT FUCKING GUESS in their instructions, they made a heck of a lot of assumptions:
- assume tokens are scoped (despite this apparently not even being an existing feature?)
- assume an LLM didn't have access
- assume an LLM wouldn't do something destructive given the power
- assume backups were stored somewhere else (to anyone reading, if you don't know where they are, you're making the same assumption)

Also, you should never give LLMs instructions that rely on metacognition. You can tell them not to guess, but they have no internal monologue; they cannot know anything. They also cannot plan to do something destructive, so telling them to ask first is pointless. A text completion only learns that it is writing something destructive after the fact.
|
Personally I don't even let my agent run a single shell command without asking for approval. That's partly because I haven't set up a sandbox yet, but even with a sandbox there is a huge "hazard surface" to be mindful of.
I wonder if AI agent harnesses should have some kind of built-in safety measure where instead of simply compacting context and proceeding, they actually shut down the agent and restart it.
That said, I also think even the most advanced agents generate code that I would never want to base a business on, so the whole thing seems ridiculous to me. This article has the same energy as losing money on NFTs.