by seanw265 57 days ago
Kimi K2.6 was also released today. I think it's fair to compare the two models.

Qwen appears to be much more expensive:

- Qwen: $1.3 in / $7.8 out

- Kimi: $0.95 in / $4 out
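
For a rough sense of what the price gap means in practice, here is a minimal sketch. It assumes the quoted prices are USD per 1M tokens (the unit isn't stated above), and the 10M-in / 2M-out workload is a made-up example, not a measurement.

```python
# Prices as quoted above. ASSUMPTION: USD per 1M tokens (unit not stated).
PRICES = {
    "Qwen": {"in": 1.30, "out": 7.80},
    "Kimi": {"in": 0.95, "out": 4.00},
}

def workload_cost(model, input_tokens, output_tokens):
    """Cost in USD for a given token count, under the per-1M-token assumption."""
    p = PRICES[model]
    return (input_tokens * p["in"] + output_tokens * p["out"]) / 1_000_000

# Hypothetical workload: 10M input tokens, 2M output tokens.
qwen = workload_cost("Qwen", 10_000_000, 2_000_000)
kimi = workload_cost("Kimi", 10_000_000, 2_000_000)
print(f"Qwen: ${qwen:.2f}  Kimi: ${kimi:.2f}  ratio: {qwen / kimi:.2f}x")
```

The ratio depends on the input/output mix: the output price gap ($7.8 vs $4) is wider than the input gap, so output-heavy workloads make Qwen look relatively more expensive.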

--

The announcement posts only share two overlapping benchmark results. Qwen appears to score slightly lower on SWE-Bench Pro and Terminal-Bench 2.0.

Qwen:

- Terminal-Bench 2.0: 65.4

- SWE-Bench Pro: 57.3

Kimi:

- Terminal-Bench 2.0: 66.8

- SWE-Bench Pro: 58.6
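
To make "slightly lower" concrete, a quick sketch of the gaps using only the four scores listed above (the dictionary layout is just for illustration):

```python
# Scores as listed above, from the two announcement posts.
SCORES = {
    "Terminal-Bench 2.0": {"Qwen": 65.4, "Kimi": 66.8},
    "SWE-Bench Pro": {"Qwen": 57.3, "Kimi": 58.6},
}

# Kimi minus Qwen, rounded to one decimal to avoid float noise.
gaps = {name: round(s["Kimi"] - s["Qwen"], 1) for name, s in SCORES.items()}

for name, gap in gaps.items():
    print(f"{name}: Kimi leads by {gap} points")
```

Both gaps are under 1.5 points.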

--

Different models have different strong suits, and benchmarks don't cover everything. But from a numbers perspective, Kimi looks much more appealing.

2 comments

I wonder if this means a better Cursor Composer model update is coming, since it builds on top of Kimi K2.
Cursor would have to run their RL pipeline all over again if they wanted to build a new Composer on K2.6, so almost definitely not.
Why wouldn't they? If they can tout improvements over Composer 2, it'll be worth the training cost.
I think the price increases on the Chinese models have made them less appealing, and with the introduction of Gemma-4, not many of them are at the Pareto frontier (in my experience too, not just the stats): https://arena.ai/leaderboard/text/overall?viewBy=plot
FWIW in my recent testing I couldn't find a better model than Gemma 4 31B for the price (OpenRouter only). My use case was taking discussions and identifying business ideas, so a somewhat conceptual problem-solving task.