by benjosaur
110 days ago
yes exactly. with proper configuration (e.g. /sandbox with normal claude code) it is impossible for the agent to escape. however, agent orchestrations/wrappers that aim to eliminate friction subtly override these proper setups, leading to the nasty scenario of:
1) you assume anthropic's /sandbox is keeping you safe
2) the model reaffirms your belief that /sandbox is keeping you safe
3) you are not safe
4) you leave your agent running overnight and goal drift deletes your os