No, and they're lying about the most important claim: that this is not a model specialized to IMO problems.
From the thread:
> just to be clear: the IMO gold LLM is an experimental research model.
The thread tried to muddy the narrative by saying the methodology can generalize, but no one is claiming the actual model is a generalized model.
We'd be having a massively different conversation if a generalized model that could become the next iteration of ChatGPT had achieved this level of performance.
There are trillions of dollars at stake in hyping up these products; I take everything these companies write with a cartload of salt.