Hacker News
by onlyrealcuzzo 312 days ago
I will only add that it's interesting that the results graphic simply highlights Opus 4.1 rather than marking which model has the best score on each benchmark. Opus 4.1 scored best on only about half of the benchmarks, and was worse than Opus 4.0 on at least one measure.