by thebigspacefuck 100 days ago
Grok 4.20-beta1 scores above GPT-5.4-high and just behind Opus 4.6 on LMArena for Text https://arena.ai/leaderboard

I guess for coding if you’re not first you’re last, but this is damn impressive considering. It looked like they pulled the coding model from the benchmarks, but it was similar.

1 comment

According to https://artificialanalysis.ai, it's around Gemini Flash 3, or some of the Chinese open-weight models, like GLM 5.

For all the money burned, I am not impressed. Why would I use Mecha Hitler for almost double the cost of Gemini Flash 3?