by boroboro4 437 days ago
They updated the paper and included Gemini 2.5. It's the only model that got a non-trivial score: 10/42 (it mostly solved one problem).