by throwaw12
13 days ago
IMO sandboxing is not a solution in this case. Imagine a scenario where an agent deletes the test code and pushes the change, another agent evaluates it as a low-risk PR because it doesn't touch the business logic, and the PR gets merged to master.
So don't do that. If you want to sandbox an LLM, all output of any consequence needs to pass through a human brain qualified to evaluate whether those consequences are desirable or not. If you don't want to do that because reading LLM output is exhausting, you're free to discover the consequences in some other way, but that doesn't mean sandboxing isn't a solution. It just comes with the tradeoff that you can't outsource all decisions to LLMs.
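
To make that concrete, here is a rough sketch of what such a merge gate could look like. Everything in it is hypothetical and made up for illustration (`FileChange`, `review_flags`, `may_merge`, the verdict strings); it is not any real CI or code-host API. The point is only that the reviewing agent's verdict is advisory, and the merge decision itself hangs on a human approval.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileChange:
    """One file touched by a PR (hypothetical shape, not a real API)."""
    path: str
    status: str  # assumed values: "added", "modified", "deleted"


def is_test_path(path: str) -> bool:
    """Crude heuristic for 'this is test code'; adjust to your repo layout."""
    parts = path.lower().split("/")
    name = parts[-1]
    in_test_dir = "tests" in parts[:-1] or "test" in parts[:-1]
    return in_test_dir or name.startswith("test_") or name.endswith("_test.py")


def review_flags(changes: list[FileChange]) -> list[str]:
    """Things to put in front of the human reviewer, e.g. deleted tests."""
    return [
        f"deletes test file: {c.path}"
        for c in changes
        if c.status == "deleted" and is_test_path(c.path)
    ]


def may_merge(agent_verdict: str, human_approved: bool) -> bool:
    """The agent can veto, but only a human approval can allow the merge."""
    if agent_verdict == "high-risk":
        return False
    return human_approved
```

With a gate like this, the "agent deletes tests, other agent calls it low-risk" PR still stops at `may_merge` until a person has looked at the flags and signed off.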