by lordmauve
25 days ago
Given that DeepSWE just blew apart the SWE-Bench Pro benchmark and handed a 14-point lead to GPT-5.5, it looks pretty bad that they've listed SWE-Bench first in the model release and included no DeepSWE numbers. Like, this release isn't obviously an answer to that. Or maybe it is, but publish the DeepSWE numbers so we can see for ourselves.