Hacker News new | ask | show | jobs
by ankit219 333 days ago
Seen this happen many times with current agent implementations. With RL (and provided you have enough use case data) you can get to a high accuracy on many of these shortcomings. Most problems arise from the fact that prompting is not the most reliable mechanism and is brittle. Teaching a model on specific tasks help negate those issues, and overall results in a better automation outcome without devs having to make so much effort to go from 90% to 99%. Another way to do it is parallel generation and then identifying at runtime which one seems most correct (majority voting or llm as a judge).

I agree with you on the hype part. Unfortunately, that is the reality of current silicon valley. Hype gets you noticed, and gets you users. Hype propels companies forward, so that is about to stay.