by staticassertion
90 days ago
Eh, I don't know. I mean, are we seeing better models now? Of course. But are they truly leaps and bounds better? No, and I get confused by people saying that they are. They're better, but not like... 10x better. When people were studying ChatGPT 3.5, everyone would go "Oh, but that wasn't 4!", and when people talk about Opus 4.5 they go "4.6 is so much better!". My personal position right now is that people are extremely bad at evaluating model output and changes in model capabilities. Model benchmarks do not support the claim that models are 10x better than they were a year ago, but from how people discuss them you'd think 10x was underselling it.