Hacker News
by pixl97 121 days ago
I mean for this particular benchmark, yes.

You'd have to put it in an agentic loop to perform corrections otherwise.
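
Rough sketch of what an agentic correction loop could look like, in case it helps. Everything here is hypothetical: `correction_loop`, `generate`, `check`, and the toy `fake_model` / `check_sum` stand-ins are names made up for illustration, not any real library's API. The idea is just generate, check, feed the failure back, retry.

```python
from typing import Callable, Optional, Tuple


def correction_loop(
    generate: Callable[[str], str],         # hypothetical: wraps whatever model you call
    check: Callable[[str], Optional[str]],  # hypothetical: returns None if OK, else an error message
    task: str,
    max_rounds: int = 3,
) -> Tuple[str, int, bool]:
    """Generate an answer, check it, and retry with the checker's feedback.

    Returns (last_answer, attempts_used, passed).
    """
    prompt = task
    answer = ""
    for attempt in range(1, max_rounds + 1):
        answer = generate(prompt)
        error = check(answer)
        if error is None:
            return answer, attempt, True
        # Feed the failure back so the next attempt can correct it.
        prompt = f"{task}\nPrevious attempt: {answer}\nProblem: {error}"
    return answer, max_rounds, False


# Toy stand-ins so the sketch runs without a real model.
def fake_model(prompt: str) -> str:
    # Gets it wrong on the first try, right once it sees feedback.
    return "4" if "Problem:" in prompt else "5"


def check_sum(answer: str) -> Optional[str]:
    return None if answer == "4" else "expected 2 + 2 to equal 4"


if __name__ == "__main__":
    print(correction_loop(fake_model, check_sum, "What is 2 + 2?"))
```

With `max_rounds=1` this collapses to a single-shot run, which is roughly the non-agentic case.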