Hacker News
by MisterBiggs 78 days ago
What happens once an agent can reliably get 100% on SWE-bench?