Hacker News
by ivanovm 5 days ago
The benchmarks are now the equivalent of the SAT, ACT, and other standardized exams for humans. They are directionally quite predictive, but with plenty of outcome variance at the margins.