by csomar
544 days ago
I don't care about benchmarks. O1 ranks higher than Claude on "benchmarks" but performs worse in particular real-life coding situations. I'll judge the model myself by how useful and correct it is for my tasks rather than by hypothetical benchmarks.