Hacker News
by KoolKat23 71 days ago
Perhaps I'm wrong, but it definitely seems to be SOTA. Although, looking at its ARC-AGI-2 score, its reasoning isn't very good. I suspect it's got the benefits of scale but lacks that human-added element, which is understandable considering they claim to be building it from the ground up. This should come in time if they have a good team. In real life, I'd imagine one would worry about overfitting when using it.

(I'm not using it, as I'm not agreeing to their ad terms.)