by nerpderp82 816 days ago
MMLU is not a good benchmark and needs to stop being used.

I can't find the exact section, but at the end of one of the videos at https://www.youtube.com/@aiexplained-official/videos he does a deep dive into the questions and answers in MMLU. There are so many typos, omissions, and errors in both the questions and the answers that it should no longer be used.

This is it, with the correct time offset into the video: https://www.reddit.com/r/OpenAI/comments/18i02oe/mmlu_is_not...

The original, longer complaint against MMLU: https://www.youtube.com/watch?v=hVade_8H8mE