by stingraycharles
166 days ago
It’s a shame, but it’s also understandable, that they cannot compete with SOTA models like Sonnet and Opus. They’re focused almost entirely on benchmarks, and I think Grok is doing the same thing. I wonder if people could figure out a type of benchmark that cannot be optimized for, such as having multiple models compete head-to-head against each other on some task.
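To make the "models compete against each other" idea a bit more concrete, here is a rough sketch of how such a benchmark could be scored with Elo-style ratings. Elo is my own choice for illustration, not something proposed above. The model names and match results are hypothetical placeholders, and how a single match is actually judged is left open.

```python
def expected_score(r_a: float, r_b: float) -> float:
    """Standard Elo expectation: probability-like score of A against B."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def update_elo(ratings: dict, a: str, b: str, score_a: float, k: float = 32.0) -> None:
    """Update both ratings in place after one match.

    score_a is 1.0 if a won, 0.0 if a lost, 0.5 for a draw.
    """
    e_a = expected_score(ratings[a], ratings[b])
    delta = k * (score_a - e_a)
    ratings[a] += delta
    ratings[b] -= delta  # zero-sum: whatever a gains, b loses


# Hypothetical models, all starting from the same rating.
ratings = {"model_a": 1000.0, "model_b": 1000.0, "model_c": 1000.0}

# Made-up outcomes: (first model, second model, score of first model).
matches = [
    ("model_a", "model_b", 1.0),
    ("model_b", "model_c", 0.5),
    ("model_a", "model_c", 1.0),
]

for a, b, score_a in matches:
    update_elo(ratings, a, b, score_a)

# Print a leaderboard, highest rating first.
for name, rating in sorted(ratings.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {rating:.1f}")
```

Because the ranking depends on the pool of opponents rather than a fixed answer key, there is less of a static target to tune against, although the judging step could still be gamed.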