by orbital-decay
54 days ago
Only if the benchmark is private and done properly on relevant tasks, which is rarely the case. I can guarantee that you'll have a ton of blind spots if you look at it through the lens of a ranking ladder on some generic tasks.