by GaggiX 3 days ago
If MiMo v2.5 Pro can run at >1000 tok/s on GPUs, then I expect to see the same from OpenAI/Anthropic/Google soon.
1 comment

I wouldn't expect any of the American labs to be particularly good at (or have much desire for) working on efficiency; they've consistently shown themselves to be uninterested in (if not incapable of) actually improving on those kinds of things. The closest we've seen lately is that maybe GPT-5.5 (and Opus 4.{7,8}?) are more token-efficient, i.e. they solve things with fewer tokens...? It hasn't been coupled with any other kind of efficiency bump, though, and we're seeing higher costs anyway in most places where the American labs are involved.

The only players that seem capable of consistently doing more with less money are the Chinese labs.