Hacker News
To solve the benchmark crisis, evals must think (blog.fig.inc)
6 points by hsikka 241 days ago