Hacker News
Adding Error Bars to Evals: A Statistical Approach to Language Model Evaluations (arxiv.org)
2 points by mnk47 568 days ago