by vjerancrnjak
442 days ago
If it overfits on the whole internet, then it’s like a search engine that returns really relevant results with some lossy side effects. A recent benchmark on the unseen 2025 Math Olympiad shows that none of the models can actually problem-solve. They all, accidentally or on purpose, had prior solutions in the training set.
https://x.com/mbalunovic/status/1907436704790651166