


	by viksit 323 days ago
	In the fourth post of my series, I tackle how to make multi-step agent workflows learn their behavior from data. Most agents today rely on vibes: prompt tuning, hand-written templates, and hope(!). This post is about replacing that with metrics and optimization. Each branch in the workflow learns how to behave, not just where to route. I show how to set up a reward, plug in an optimizer, and treat agent behavior as something you can tune like a model.
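
To make the "reward plus optimizer" idea concrete, here is a rough sketch. It is not the code from the post. The branch names, templates, keyword-based reward, and the `fake_agent` stand-in for a model call are all made up for illustration, and the "optimizer" is just an exhaustive search over candidate templates per branch.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-in for an LLM call. It only fills in the template,
# so the sketch runs offline and deterministically.
def fake_agent(template: str, query: str) -> str:
    return template.format(query=query)

# Assumed reward: 1.0 if the output mentions the expected keyword, else 0.0.
# A real metric would be task-specific (accuracy, a rubric score, etc.).
def reward(output: str, expected_keyword: str) -> float:
    return 1.0 if expected_keyword in output.lower() else 0.0

# "Optimizer": score every candidate template on the branch's examples
# and keep the one with the highest mean reward.
def optimize_branch(
    candidates: List[str], examples: List[Tuple[str, str]]
) -> Tuple[str, float]:
    scores = {
        t: sum(reward(fake_agent(t, q), kw) for q, kw in examples) / len(examples)
        for t in candidates
    }
    best = max(scores, key=scores.get)
    return best, scores[best]

# Illustrative candidate behaviors per branch (hypothetical).
CANDIDATES: Dict[str, List[str]] = {
    "billing": ["Answer: {query}", "Refund policy applies. Answer: {query}"],
    "support": ["Answer: {query}", "Try restarting first. Answer: {query}"],
}

# Illustrative labeled examples per branch: (query, keyword the reply should contain).
EXAMPLES: Dict[str, List[Tuple[str, str]]] = {
    "billing": [("I was charged twice", "refund"), ("Cancel my plan", "refund")],
    "support": [("App crashes on launch", "restart"), ("Screen is frozen", "restart")],
}

# Each branch is tuned independently against its own data.
tuned = {branch: optimize_branch(CANDIDATES[branch], EXAMPLES[branch])
         for branch in CANDIDATES}

for branch, (template, score) in tuned.items():
    print(f"{branch}: mean reward {score:.2f} with template {template!r}")
```

The point of the toy is the shape: once each branch has a metric and a set of candidate behaviors, choosing how it behaves becomes a search problem rather than guesswork. A real setup would swap in an actual model call, a real metric, and a less naive optimizer.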