|
|
|
|
|
by haffi112
315 days ago
|
|
It makes it look like the presentation is rushed or made last minute. Really bad to see this as the first plot in the whole presentation. Also, I would have loved to see comparisons with Opus 4.1. Edit: Opus 4.1 scores 74.5% (https://www.anthropic.com/news/claude-opus-4-1). This makes it sound like Anthropic released the upgrade to still be the leader on this important benchmark. |
|
Or written by GPT-5?