by zarzavat
510 days ago
OpenAI played themselves here. Now nobody is going to take any of their results on this benchmark seriously, ever again. That o3 result has just vanished in a puff of smoke. If they had blinded themselves properly, that wouldn't be the case. Meanwhile, other AI companies now have the opportunity to be the first to get a significant result on FrontierMath.

[1]: https://epoch.ai/math-problems/submit-problem - the benchmark is composed of "hundreds" of questions, so at the absolute lowest it cost 300 * 200 = 60,000 dollars.