Hacker News
by klabb3 390 days ago
Well yes, but there is no better way to measure without resorting to pure hearsay. How would you make an accurate assessment of something so inherently vague?

Alter the benchmark space that we care about, for example by focusing only on ARC-AGI-2, and suddenly the gains are no longer diminishing but accelerating.