by robbomacrae
46 days ago
I don't think that is entirely fair. I don't see them stating anywhere that they are measuring coding capabilities. What they say is "Using complex games to probe real intelligence," and that seems very much in line with the methodology in ARC-AGI-3. The results here, in the OP article, and on https://www.designarena.ai all tell a similar story: Kimi K2.6 is up and in the SOTA mix.