by Cicero22 312 days ago
Where did you get the top ten from?

https://huggingface.co/spaces/TIGER-Lab/MMLU-Pro

Are you discounting all of the self-reported scores?

1 comment

Came here to say this. It's behind the 14b Phi-reasoning-plus (which is self-reported).

I don't understand why "TIGER-Lab"-sourced scores are listed as 'unknown' for model size.