Hacker News
by davidheineman, 60 days ago
SWE-bench is fantastic! IMO, the scrutiny is a byproduct of the adoption and success of the benchmark.