Hacker News
by thelinuxkid 103 days ago
Why is it not true?

SWE-bench is the standard benchmark for measuring an LLM's coding capabilities.