| Thanks for the feedback, let me break it down: the code is just a representation of the agent's execution flow, which is different from static workflows like in Glide, or the typical ping-pong style of ReAct agent in other frameworks. It's not meant to replace your application logic, but to serve as a dynamic plan the LLM generates to complete your ad-hoc task, so there is little sense in committing it to a repo. In your app code, control flow also changes based on data using `if` statements. And you don't need to know the data to write your code, right? The LLM can handle those as well. We are working on making the execution more dynamic, possibly allowing code to be generated and executed progressively when data arrives and new prompts are given. Tools will indeed need to be written as idempotent, especially for the interactivity and human-in-the-loop mechanism we are working on. Detecting recursive loops and other issues is a technical and solvable problem; we'll get to those as well. When running code in Deno or E2B, you cannot pause it to wait for human interaction for a day, then resume it, so it won't really help with interactive agents doing really long-running operations and interacting with the real world. |
Yes. Which is why it's in my app code, committed to source control, and comes with a guarantee of determinism. If you don't consult runtime information (i.e. data), there's no reason it has to be generated at runtime.
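To make that concrete, here is a minimal sketch of what I mean. The function name, the order fields, and the thresholds are all made up for illustration; the point is only that the branching is written once, committed, and gives the same answer for the same input.

```python
def handle_refund(order: dict) -> str:
    """Decide what to do with a refund request.

    Hypothetical example: the field names and thresholds are assumptions,
    not taken from any real system. The control flow is fixed at commit
    time; only the data varies at runtime.
    """
    if order["amount"] > 100:
        return "needs_manual_review"
    if order["days_since_purchase"] > 30:
        return "rejected"
    return "auto_refunded"
```

None of the branches needed to see the data before being written, and none of them needs an LLM call to be selected.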
> We are working on making the execution more dynamic, possibly allowing code to be generated and executed progressively when data arrives and new prompts are given.
I think you just described how a ReAct agent works.
> you cannot pause it to wait for human interaction for a day, then resume it
Sure you can! Have you seen LangGraph, Trigger.dev, or Temporal?
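The underlying idea is just checkpointing: persist where you are, exit, and pick up again when the human responds. Below is a rough sketch in plain Python. It is not the API of any of those libraries; the step names, the JSON state file, and the `run` function are assumptions I made up for illustration, and a real engine would persist far more than this.

```python
import json
import os
import tempfile

# Hypothetical step names, for illustration only.
STEPS = ["fetch_invoice", "await_approval", "send_payment"]


def _load(state_path):
    """Load persisted progress, or start fresh if nothing was saved yet."""
    if os.path.exists(state_path):
        with open(state_path) as f:
            return json.load(f)
    return {"next": 0, "log": []}


def _save(state_path, state):
    """Persist progress so a later process can resume from it."""
    with open(state_path, "w") as f:
        json.dump(state, f)


def run(state_path, approval=None):
    """Run steps until finished or until a human decision is needed.

    Progress is saved after every step, so completed steps are not
    re-executed when the process is restarted later.
    """
    state = _load(state_path)
    while state["next"] < len(STEPS):
        step = STEPS[state["next"]]
        if step == "await_approval":
            if approval is None:
                _save(state_path, state)
                return "paused"  # the process can exit here, for a day or a month
            state["log"].append(f"approved_by:{approval}")
        else:
            state["log"].append(step)
        state["next"] += 1
        _save(state_path, state)
    return "done"
```

Calling `run(path)` stops at the approval step and returns `"paused"`; calling `run(path, approval="alice")` later, from a different process if you like, continues from the saved step rather than from the beginning.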
Also, how do you pause based on an AST that might get invalidated if you recompute it on new contextual information?
Does the human keep approving until your AST works?
Sorry to be terse, but surely I'm missing something here.
Since you're computing the AST upfront, you'd get it wrong more often than a ReAct agent that decides each step as and when it's needed. So, do I pay OpenAI every time you make a mistake in the AST?
Or is your claim that your AST is infallible?