HN Mirror


by seanw265 57 days ago

Kimi K2.6 also released today. I think it's fair to compare the two models.

Qwen appears to be much more expensive:

- Qwen: $1.3 in / $7.8 out

- Kimi: $0.95 in / $4 out

The announcement posts only share two overlapping benchmark results. Qwen appears to score slightly lower on SWE-Bench Pro and Terminal-Bench 2.0.

Qwen:

- Terminal-Bench 2.0: 65.4

- SWE-Bench Pro: 57.3

Kimi:

- Terminal-Bench 2.0: 66.8

- SWE-Bench Pro: 58.6

Different models have different strong suits, and benchmarks don't cover everything. But from a numbers perspective, Kimi looks much more appealing.
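To put rough numbers on that comparison, here is a minimal Python sketch using only the figures quoted above. The per-1M-token pricing unit is an assumption (the comment doesn't state one), and the helper names and the 3M-in / 1M-out workload in the usage note are made up for illustration.

```python
# Figures as quoted in the comment above.
# ASSUMPTION: prices are USD per 1M tokens (the unit is not stated in the thread).
PRICES = {
    "Qwen": {"in": 1.30, "out": 7.80},
    "Kimi": {"in": 0.95, "out": 4.00},
}
SCORES = {
    "Qwen": {"Terminal-Bench 2.0": 65.4, "SWE-Bench Pro": 57.3},
    "Kimi": {"Terminal-Bench 2.0": 66.8, "SWE-Bench Pro": 58.6},
}


def price_ratio(kind: str) -> float:
    """How many times more Qwen costs than Kimi for 'in' or 'out' tokens."""
    return PRICES["Qwen"][kind] / PRICES["Kimi"][kind]


def score_gap(benchmark: str) -> float:
    """Kimi's lead over Qwen on a benchmark, in points."""
    return round(SCORES["Kimi"][benchmark] - SCORES["Qwen"][benchmark], 1)


def blended_cost(model: str, input_m: float, output_m: float) -> float:
    """Cost of a hypothetical workload, with token counts given in millions."""
    p = PRICES[model]
    return p["in"] * input_m + p["out"] * output_m
```

With these numbers, `price_ratio("out")` comes out to roughly 1.95 and `price_ratio("in")` to roughly 1.37, while `score_gap` gives 1.4 and 1.3 points in Kimi's favor. For a hypothetical workload of 3M input and 1M output tokens, `blended_cost` is about 11.70 for Qwen versus 6.85 for Kimi.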

2 comments

archon810 57 days ago

I wonder if this means a better Cursor Composer model update is coming, since it builds on top of Kimi K2.


2001zhaozhao 56 days ago

Cursor would have to run their RL pipeline all over again if they wanted to build a new Composer on K2.6, so almost definitely not.


archon810 56 days ago

Why wouldn't they? If they can tout improvements over Composer 2, it'll be worth the training cost.


mchusma 57 days ago

I think the rising prices of the Chinese models have made them less appealing, and with the introduction of Gemma-4 not many of them are on the Pareto frontier anymore (that matches my experience too, not just the stats): https://arena.ai/leaderboard/text/overall?viewBy=plot
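For anyone unfamiliar with the term: a model is on the Pareto frontier if no other model is at least as cheap and at least as good while being strictly better on one of the two. A minimal Python sketch of that check follows; the function name and the `(name, price, score)` tuple layout are hypothetical, and this is not how the linked leaderboard computes its plot.

```python
def pareto_frontier(models):
    """Return names of models not dominated on (price, score).

    models: list of (name, price, score) tuples.
    Lower price is better; higher score is better.
    """
    frontier = []
    for name, price, score in models:
        dominated = any(
            # The other model is no worse on both axes...
            (p <= price and s >= score)
            # ...and strictly better on at least one of them.
            and (p < price or s > score)
            for _, p, s in models
        )
        if not dominated:
            frontier.append(name)
    return frontier
```

As an illustration using only figures quoted elsewhere in this thread (output price and SWE-Bench Pro score), `pareto_frontier([("Qwen", 7.8, 57.3), ("Kimi", 4.0, 58.6)])` returns `["Kimi"]`, since Kimi is both cheaper and higher scoring on that pair of metrics.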


chrisss395 57 days ago

FWIW, in my recent testing I couldn't find a better model than Gemma 4 31B for the price (OpenRouter only). My use case was taking discussions and identifying business ideas, so a somewhat conceptual problem-solving task.
