Hacker News
by yalok 3 hours ago
If there are specific tests/evals the agent has to satisfy and can run by itself, it can easily iterate for hours. That time also includes running those tests/evals, which may not be quick.
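Roughly the loop I have in mind, as a minimal Python sketch. `propose_fix` and `run_evals` are hypothetical callables standing in for whatever the agent and the test suite actually are, and the time budget is an assumption added for illustration. It tracks eval time separately to show how much of the wall clock goes to running the tests rather than to the agent itself.

```python
import time
from typing import Callable, Dict, Union


def iterate_until_pass(
    propose_fix: Callable[[int], None],   # hypothetical: agent makes one attempt
    run_evals: Callable[[], bool],        # hypothetical: True when all tests/evals pass
    budget_s: float,                      # assumed wall-clock budget in seconds
) -> Dict[str, Union[bool, int, float]]:
    """Loop: attempt a change, run the evals, stop on success or when time runs out."""
    start = time.monotonic()
    eval_time = 0.0
    attempts = 0
    passed = False

    while time.monotonic() - start < budget_s:
        attempts += 1
        propose_fix(attempts)

        # Time spent inside the evals counts toward the total as well.
        t0 = time.monotonic()
        passed = run_evals()
        eval_time += time.monotonic() - t0

        if passed:
            break

    return {
        "passed": passed,
        "attempts": attempts,
        "eval_time_s": eval_time,
        "total_time_s": time.monotonic() - start,
    }
```

With a slow suite, `eval_time_s` can make up most of `total_time_s`, so "the agent ran for hours" may largely mean "the tests ran for hours".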