Hacker News new | ask | show | jobs
by nvanlandschoot 121 days ago
Method: I used OpenAI’s published SWE-Bench Pro chart points and matched GPT-5.3-Codex-Spark to the baseline model at comparable accuracy levels by reasoning effort. At similar accuracy, the effective speedup is closer to ~1.37× rather than 15×.