by cjtrowbridge 618 days ago
I love how they include a helpful chart that shows this model scores worse than everything else.
5 comments

Am I looking at the wrong table? It dominates everything on visual interpretation benchmarks.

Edit: specifically OCRBench and VQAv2

It's not that bad, and I'd much rather that they be honest instead of lying like everyone else does.
All jokes aside (and that did make me laugh) at least they're not training just to hit the benchmarks, which seem to be more meaningless as a quality indicator with each passing day.
I see a few models (3 models in MMMU) that score lower than Nvidia's. But putting that aside, they at least get points for apparent objectivity. At least they probably aren't fudging numbers.
Well but it actually doesn't, unless you're looking only at MMMU.
Exactly. On some benchmarks it is close to or better than GPT-4o.

I wonder if one of the reasons they released it was to respond to OpenAI's plans to enter the chipmaking market.