| HN Mirror

by smoe 30 days ago

Earlier this week I started testing Chinese models on my codebase. I haven’t really looked at interactive coding yet; so far I've focused on issue triage, bug auto-fixing, log analytics, etc.

I used DeepSeek, Kimi, GLM, Qwen, and MiMO, with GPT-5.5 high as the reference, all running in the Pi harness without anything installed.

So far, Kimi and MiMO look the most promising to me. I haven’t tested them rigorously enough to make a strong statement, but my first impression is that, in practice, all those models may be less far behind on typical daily tasks than people think.

They are a bit “work hard, not smart”: they get to same-ish results more slowly and use more tokens, but at a fraction of the price.

4 comments

try-working 30 days ago

I just did a little comparison using benchmarks for GPT 5.1 through 5.4 to map out the equivalent capability-level of some of the Chinese models.

Based on these benchmarks, here's a rough mapping:

- Qwen 3.7 ~= GPT 5.3

- Kimi K2.6 ~= GPT 5.15

- DS V4 ~= GPT 5.1

So yes, we have GPT 5 at home now. No need to pay the Legacy Labs anymore.

Here's the benchmark I used since I can't post images here: https://x.com/trydotworks/status/2058004995195490706?s=20

_under_scores_ 30 days ago

I switched to predominantly using mimo this week, mostly out of curiosity to see how dependent I was on frontier models. Honestly, I can't really tell the difference. I would say I work on pretty average codebases with well-known frameworks doing pretty typical things, and my initial impression is that mimo, kimi and deepseek can probably handle what I need more or less the same as gpt5.5 or claude.

c0rruptbytes 30 days ago

I personally really like DS4 Flash. It's the largest I can run locally at decent speeds, and I feel like it's good enough to maintain a codebase with less effort.

r0b05 30 days ago

What hardware and quant do you run it with?

maxdo 30 days ago

Maybe I need to give it a second chance. Surprisingly, Kimi 2.6 consistently fails even to generate a valid JSON plan, where Gemma 4 was doing really well, but slowly.

JSR_FDED 29 days ago

Are you going through OpenRouter or direct? I’ve had nothing short of excellent results from Kimi.
