Hacker News
by tadamcz 281 days ago
In July, I predicted that AI models would someday learn to cheat on SWE-bench by accessing future git history. Turns out they were already doing it!