by bawolff
138 days ago
A paper that is not about benchmarking or ML research is bad from the perspective of benchmarking. Not exactly a shocker. The authors themselves literally state: "Unlike other proposed math research benchmarks (see Section 3), our question list should not be considered a benchmark in its current form"

Sounds to me like a benchmark in all but name. And they failed pretty terribly at achieving what they set out to do.