|
"In this post, I’ll cover a third, not-so-obvious approach: building ways for the agent to validate more of its own work before a human has to step in." This has been an obvious thing to do since at least January (since Geoffrey Huntley published "everything is a ralph loop"), and it is how I've been working: build enough orchestration tooling to automate everything, from development container bring-up and the build to unit tests, integration tests, and finally using the software the way an end user would. Once that basis is solid, set performance goals so the automated agent ("gym") can iterate autonomously and let you know when it's "done". I understand this probably does not work if you're on a subscription rather than the API (tokens burn fast), but it has been extremely productive for me. |
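
A minimal sketch of that kind of loop, to make the shape concrete. Everything here is an assumption for illustration: the stage names and commands, and the `agent_step` and `measure` callables, are hypothetical placeholders rather than any particular tool's API. The idea is just that each stage is a command whose exit code says pass or fail, and the agent gets the first failure (or the unmet goal) as feedback.

```python
import subprocess
import sys
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple

Stage = Tuple[str, Sequence[str]]  # (stage name, command to run)


@dataclass
class StageResult:
    name: str
    ok: bool
    output: str


def run_stages(stages: Sequence[Stage]) -> List[StageResult]:
    """Run stages in order and stop at the first one that fails."""
    results = []
    for name, cmd in stages:
        proc = subprocess.run(cmd, capture_output=True, text=True)
        ok = proc.returncode == 0
        results.append(StageResult(name, ok, proc.stdout + proc.stderr))
        if not ok:
            break  # later stages are meaningless once an earlier one fails
    return results


def iterate(
    stages: Sequence[Stage],
    agent_step: Callable[[str], None],  # hypothetical: asks the agent to change the code
    measure: Callable[[], float],       # hypothetical: returns the metric, lower is better
    goal: float,
    max_rounds: int = 5,
) -> Optional[int]:
    """Validate, check the goal, feed results back to the agent, repeat.

    Returns the round in which everything passed and the goal was met,
    or None if the round budget ran out first.
    """
    for round_no in range(1, max_rounds + 1):
        results = run_stages(stages)
        failed = next((r for r in results if not r.ok), None)
        if failed is None and measure() <= goal:
            return round_no  # "done": all stages pass and the goal is met
        feedback = (
            f"{failed.name} failed:\n{failed.output}"
            if failed
            else "all stages pass, but the performance goal is not met"
        )
        agent_step(feedback)
    return None


if __name__ == "__main__":
    # Placeholder commands; real ones would bring up the container, build, test, etc.
    noop = [sys.executable, "-c", "pass"]
    demo_stages = [("build", noop), ("unit-tests", noop), ("integration", noop)]
    state = {"latency_ms": 30.0}

    def fake_agent(feedback: str) -> None:
        state["latency_ms"] -= 10.0  # stand-in for an actual agent making changes

    print(iterate(demo_stages, fake_agent, lambda: state["latency_ms"], goal=10.0))
```

The `max_rounds` cap is the part that matters if you pay per token: it bounds how long the loop can run unattended.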