by viksit
323 days ago
In the fourth post in my series, I tackle how to make multi-step agent workflows learn their behavior from data. Most agents today rely on vibes: hand-tuned prompts, hand-written templates, and hope(!). This post is about replacing that with metrics and optimization. Each branch in the workflow learns how to behave, not just where to route. I show how to set up a reward, plug in an optimizer, and treat agent behavior as something you can tune like a model.
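To make the "reward plus optimizer" idea concrete, here is a minimal sketch in plain Python. It is not the code from the post: `Example`, `reward`, `optimize_branches`, and `stub_model` are hypothetical names, the stub stands in for a real LLM call, and a brute-force search over candidate instructions stands in for whatever optimizer the post actually uses.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Example:
    """Hypothetical training example: which branch handles it, input, target."""
    branch: str
    text: str
    expected: str


def reward(predicted: str, expected: str) -> float:
    """Exact-match reward. A real metric would likely be softer than this."""
    return 1.0 if predicted == expected else 0.0


def optimize_branches(
    candidates: Dict[str, List[str]],
    run: Callable[[str, str], str],
    trainset: List[Example],
) -> Dict[str, str]:
    """Pick, per branch, the candidate instruction with the highest mean reward."""
    best: Dict[str, str] = {}
    for branch, instructions in candidates.items():
        examples = [ex for ex in trainset if ex.branch == branch]

        def mean_reward(instruction: str) -> float:
            total = sum(reward(run(instruction, ex.text), ex.expected) for ex in examples)
            return total / max(len(examples), 1)

        # max() keeps the first candidate on ties, so ordering acts as a prior.
        best[branch] = max(instructions, key=mean_reward)
    return best


def stub_model(instruction: str, text: str) -> str:
    """Assumption: a deterministic stand-in for an LLM call, for illustration only."""
    return text.split()[0] if "one word" in instruction else text


trainset = [
    Example("billing", "refund please", "refund"),
    Example("support", "reset password", "reset password"),
]
candidates = {
    "billing": ["Answer the customer.", "Reply with one word."],
    "support": ["Reply with one word.", "Answer the customer."],
}
best = optimize_branches(candidates, stub_model, trainset)
print(best)
```

The point of the sketch is the shape of the loop: each branch gets its own instruction chosen by measured reward on its own examples, rather than by intuition. Swapping the brute-force search for a smarter optimizer changes only `optimize_branches`.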