by PUSH_AX
35 days ago
They set themselves up for flak when they use whatever these evals are… They did the same for Composer 2, which was evaluated as being in close competition with frontier models. Spoiler alert: it wasn’t even close in practice. So now 2.5 is supposed to compete with Opus 4.7? Sure…