Hacker News
by
pants2
181 days ago
Benchmarks are moving closer to reality, though, with things like FrontierScience and SWE-Bench Pro.
1 comment
Ianjit
181 days ago
Maybe you are right, but maybe it’s radiology all over again.