by Snuggly73 394 days ago
I mean that SWE-bench may be specifically targeted in training, so the results may not reflect real-world performance.