by guilamu 50 days ago
Yes, Opus 4.7 fast (no reasoning) did a worse job than Sonnet 4.6 high (with reasoning), according to Gemini 3.1 Pro's evaluation.
1 comments

Your table doesn't indicate reasoning vs non-reasoning, or reasoning level
When nothing is noted, it's max reasoning (xhigh in Copilot Chat in VS Code, if available).

The models not available on Copilot were tested through opencode (max reasoning), and DeepSeek v4 was tested through Cline (also with max reasoning).