MMLU is not a good benchmark and needs to stop being used.
I can't find the exact section, but at the end of one of the videos at https://www.youtube.com/@aiexplained-official/videos he does a deep dive into MMLU's questions and answers, and there are so many typos, omissions, and errors in both that it should no longer be used.