Hacker News
Strategic Overclaiming of LLM Reasoning Capabilities Through Evaluation Design (huggingface.co)
1 point by heyitsguay 369 days ago