Hacker News
What the HellaSwag? On the Validity of Common-Sense Reasoning Benchmarks (arxiv.org)
1 point by ziptron 426 days ago