Hacker News
CRUXEval: A Benchmark for Code Reasoning, Understanding and Execution (arxiv.org)
1 point by fgfm 890 days ago