Hacker News
by turingbook 926 days ago
A comment from Boris Power, an OpenAI guy: The top-line number for MMLU is a bit gamed - Gemini is actually worse than GPT-4 when compared on normal few-shot or chain-of-thought prompting. https://twitter.com/BorisMPower/status/1732435733045199126