| HN Mirror



	by yorwba 370 days ago
	The ARC-AGI-2 paper https://arxiv.org/pdf/2505.11831#figure.4 uses a non-representative sample, and the success rate differs widely across participants. It reports that "final ARC-AGI-2 test pairs were solved, on average, by 75% of people who attempted them. The average test-taker solved 66% of tasks they attempted. 100% of ARC-AGI-2 tasks were solved by at least two people (many were solved by more) in two attempts or less." Those non-representative humans are certainly much better than current models, but they're also far from scoring 100%.
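
	To make the distinction between those three statistics concrete, here is a minimal sketch in Python. The attempt matrix is invented purely for illustration and is not data from the paper; the function names are hypothetical too. It only shows that "every task was solved by at least two people" can hold while no individual comes close to 100%.

```python
# Toy data, made up for illustration only (NOT taken from the ARC-AGI-2 paper).
# attempts[person][task] is True if that person solved the task they attempted.
attempts = {
    "p1": {"t1": True,  "t2": True,  "t3": False},
    "p2": {"t1": True,  "t2": False, "t3": True},
    "p3": {"t1": False, "t2": True,  "t3": True},
    "p4": {"t1": False, "t2": False, "t3": False},
}


def task_solve_rates(attempts):
    """Per task: fraction of the people who attempted it that solved it."""
    per_task = {}
    for results in attempts.values():
        for task, solved in results.items():
            per_task.setdefault(task, []).append(solved)
    return {task: sum(r) / len(r) for task, r in per_task.items()}


def person_solve_rates(attempts):
    """Per person: fraction of the tasks they attempted that they solved."""
    return {
        person: sum(results.values()) / len(results)
        for person, results in attempts.items()
        if results  # skip anyone with no attempts
    }


def every_task_solved_by_at_least(attempts, k=2):
    """True if each task was solved by at least k different people."""
    counts = {}
    for results in attempts.values():
        for task, solved in results.items():
            counts[task] = counts.get(task, 0) + int(solved)
    return all(count >= k for count in counts.values())


task_rates = task_solve_rates(attempts)
person_rates = person_solve_rates(attempts)

print("mean per-task solve rate:  ", sum(task_rates.values()) / len(task_rates))
print("mean per-person solve rate:", sum(person_rates.values()) / len(person_rates))
print("best individual score:     ", max(person_rates.values()))
print("every task solved by >= 2: ", every_task_solved_by_at_least(attempts, k=2))
```

	In this toy matrix every task is solved by two people, yet the best individual solves only two of three tasks, which is the same shape as the gap described in the quote.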