Hacker News
by D-Machine 136 days ago
The ARC-AGI 2 private test set is one current bar that a lot of people consider important, and it will convince a lot of people again if LLMs start doing really well on it. For now, though, LLM performance still degrades sharply on the private set and remains far below human performance.