by andyk 544 days ago
That has a double meaning, and it's half tongue-in-cheek.

1) since we are creating a contamination-free version of SWE-bench (i.e., scraping a new test set after submissions are frozen), agents in this contest are guaranteed to be unable to "cheat": models can't have trained on the benchmark, and agents can't memorize answers.

2) as a general rule in life, don't cheat on things (not that there aren't exceptions)