by usernametaken29
40 days ago
I worked extensively on ARC AGI before, and one thing is SURE as hell: OpenAI and Gemini in particular use this as marketing material. You can correlate the benchmark releases with stock price increases. They feed synthetic ARC datasets into their models to boost the numbers. There is no doubt in my mind Gemini is no better than DeepSeek other than being specifically fine-tuned for ARC AGI. Heck, they even say so, and they say they have paid annotations for ARC. Again, economic incentives.
As for whether these models are actually better at the benchmarks: likely not. See ARC 3, where the gap is vanishingly small.