by sdesol 11 hours ago
> at a Sonnet 4.6-level model

MiMo v2.5.0-Pro is honestly the first Chinese model I've tried where I really thought, why should I use Claude Sonnet when I can get the same results for a fraction of the cost? There was always something off about Chinese models that made it apparent they couldn't fully compete with GPT, Claude, Gemini, etc., but this was the first model where I was like, this feels like Sonnet.

I can't prove it, but I think they trained heavily on Claude output. From my perspective I don't care since Anthropic trained on my data.

Using them also works well for North Americans, since our peak hours are not theirs.

If I had one complaint, the v2.5.0-Pro model thinks too much.

3 comments

I find deepseek-v4-pro to be every bit as good as sonnet tbh
Is there a guide to running these models locally? Sonnet-level inference on my own hardware would be world-changing.

I have Claude but I don't want to ask it because Anthropic could decide to sabotage me.

They won't be giving this away, at least not for some years. It almost certainly has distillation data embedded in it, and that would be a smoking gun.
What? I just searched the web and the results say MiMo V2.5 Pro is fully open source. The weights seem to be out there.

Distillation is not a problem.

There are certainly open weights available, but I don't know if that is what is running on the service.
GLM 5.1 is stronger than Sonnet 4.6 in my opinion, but while their coding plan is a good value, MiMo beats it on price. I haven't used MiMo much yet but it felt pretty similar.