|
|
|
|
|
by nvanlandschoot
121 days ago
|
|
Method: I used OpenAI’s published SWE-Bench Pro chart points and matched GPT-5.3-Codex-Spark to the baseline model at comparable accuracy levels by reasoning effort. At similar accuracy, the effective speedup is closer to ~1.37× rather than 15×. |
|