by WarmWash
53 days ago
The open models only give the SOTA models a run for their money on gameable benchmarks. On the semi-private ARC-AGI 2 sets they do absolutely awfully (<10%, while SOTA is at ~80%). It might be too expensive, but I would be interested in the benchmarks for the current crop of SOTA models.
[0] https://arcprize.org/leaderboard
[1] https://eqbench.com/index.html