Hacker News
by ej88 30 days ago
SWE-Bench Pro has a public and a private test set; the private eval is drawn only from proprietary codebases.