by hazrmard
166 days ago
This reflects my experience. Yet, I feel that getting reliability out of LLM calls with a while-loop harness is elusive. For example:

- how can I reliably have a decision block to end the loop (or keep it running)?
- how can I reliably call tools with the right schema?
- how can I reliably summarize context / excise noise from the conversation?

Perhaps, as the models get better, they'll approach some threshold where my worries just go away. However, I can't quantify that threshold myself and that leaves a cloud of uncertainty hanging over any agentic loops I build. Perhaps I should accept that it's a feature and not a bug? :)
> - how can I reliably call tools with the right schema?
This is typically done by enabling strict mode for tool calling, which is a hermetic solution: it makes the LLM unable to generate tokens that would violate the schema. (I.e., at each step the LLM samples only from the subset of tokens that can still lead to schema-valid output.)
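
To make the "sample only from valid tokens" idea concrete, here is a toy sketch in Python. It is not how any particular provider implements strict mode. The vocabulary, the `VALID_OUTPUTS` list standing in for a schema, and the random-weight "model" are all made up for illustration; real systems compile a JSON Schema into a grammar or automaton and mask logits, but the masking step is the same in spirit.

```python
import json
import random

# Hypothetical toy vocabulary; includes tokens that would break the schema.
VOCAB = ['{', '}', '"name"', ':', '"get_weather"', '"get_time"', ',', 'hello']

# Stand-in for a schema: the complete set of token sequences we accept.
VALID_OUTPUTS = [
    ['{', '"name"', ':', '"get_weather"', '}'],
    ['{', '"name"', ':', '"get_time"', '}'],
]

def allowed_tokens(prefix):
    """Tokens that keep `prefix` extendable to at least one valid output."""
    n = len(prefix)
    return {seq[n] for seq in VALID_OUTPUTS if seq[:n] == prefix and len(seq) > n}

def constrained_sample(rng):
    """Sample one output token by token, masking schema-violating tokens."""
    prefix = []
    while True:
        allowed = allowed_tokens(prefix)
        if not allowed:  # nothing left to emit: the output is complete
            return prefix
        # Fake "model": a random positive weight for every vocabulary token.
        weights = {tok: rng.random() + 1e-9 for tok in VOCAB}
        # The mask: only tokens in `allowed` remain candidates.
        candidates = [tok for tok in VOCAB if tok in allowed]
        choice = rng.choices(candidates, weights=[weights[t] for t in candidates])[0]
        prefix.append(choice)

if __name__ == "__main__":
    out = constrained_sample(random.Random(0))
    print(json.loads("".join(out)))
```

Whatever the fake model "wants", it cannot emit `hello` or a stray `,` because those tokens never make it into `candidates`. Note this only guarantees the shape of the call, not that the model picked the right tool or sensible arguments.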