| I went down (continue to do down) this rabbit hole and agree with the author. I tried a few different ideas and the most stable/useful so far has been giving the agent a single run_bash tool, explicitly prompting it to create and improve composable CLIs, and injecting knowledge about these CLIs back into it's system prompt (similar to have agent skills work). This leads to really cool pattens like:
1. User asks for something 2. Agent can't do it, so it creates a CLI 3. Next time it's aware of the CLI and uses it. If the user asks for something it can't do it either improves the CLI it made, or creates a new CLI. 4. Each interaction results in updated/improved toolkits for the things you ask it for. You as the user can use all these CLIs as well which ends up an interesting side-channel way of interacting with the agent (you add a todo using the same CLI as what it uses for example). It's also incredibly flexible, yesterday I made a "coding agent" by having it create tools to inspect/analyze/edit a codebase and it could go off and do most things a coding agent can. https://github.com/caesarnine/binsmith |