|
|
|
|
|
by jaen
99 days ago
|
|
People building these Rube Goldberg contraptions: Do you actually run evaluations if this is any better at all than eg. giving it access to a Python REPL, or just toughing it out with random tools composed via shell scripts? Why would an LLM be better trained to access Redis via a FS vs. a native library API? Makes no sense. |
|
Limiting the potential blast radius.
If you give an agent "access to a Python REPL" (your words), you're giving it access to all of Python. i.e. you're paving the road to your own destruction when the agent goes awry. In the case of a Python interpreter, you're basically handing it an eight-lane highway upon which all sorts of pile-ups and havoc can happen.
By limiting its access to specific operations via well-defined endpoints (which is what the AGFS approach is), you're trimming that eight-lane highway back to a bicycle path.