by jstanley 300 days ago
I scrolled down but didn't find a chart comparing average human performance to AI performance.

The only chart I found was comparing the costs of different models.


Sorry, you're right that the chart on the home page does not have human performance. The leaderboard chart does: https://arcprize.org/leaderboard. By default, the leaderboard shows scores for both ARC-AGI 1 and 2. The models are much worse at 2 than at 1; the best-performing model (Grok 4, thinking) scores around 15%, while humans are at ~100%.
Thanks, and do we know if the humans are average people off the street, or unusually-intelligent people?

EDIT: OK, I see there are 3 types of humans:

"Avg. Mturker" does worst. "Stem Grad" and "Human Panel" are basically equivalent in terms of quality but differ in cost.

It's not obvious to me whether an average Mturker would be more or less clever than the average person. Mturk doesn't pay very well, so you'd think you'd have to be below average to want to do it. But potentially it attracts people of above-average intelligence who just happen to live in the third world?

Additional caveat: some of the "avg mturker" cohort are almost certainly using LLMs to participate.