Hacker News
Claude 4 Sonnet hacked SWE-bench by peeking at future commits (bayes.net)
3 points by tadamcz 282 days ago
1 comment

In July, I predicted future AI models would someday learn to cheat on SWE-bench by accessing future git history. Turns out, they were already doing it!