Hacker News
by threethirtytwo 5 hours ago
We should compare it with a human on the same coding tasks. Give both the same amount of time: the agent will of course finish earlier, but it can use the extra time to double-check and review its own code.