Hacker News
hit8run 5 days ago
This benchmark paints a very different picture, with GPT 5.5 at the very top at 70% and DeepSeek at 8%.
https://deepswe.datacurve.ai
zozbot234 5 days ago
DeepSWE has been heavily criticized, though.
https://github.com/datacurve-ai/deep-swe/issues/21
Putting GPT 5.5 on top is the obviously correct part, but everything else about it makes very little sense.