Hacker News
by acro-v 104 days ago
Because it's not true? The beating portion, not the cheaper portion.

Why is it not true?

SWE-bench is the standard benchmark for measuring an LLM's coding capabilities.