by revolvingthrow 28 days ago
Amusing that just when the big three AI providers from the US raise prices significantly, even for the mini models, you’ve got a Chinese model slashing its already-cheap pricing by 75%. Not to mention you can run this model on your own hardware, although admittedly even the Flash variant stretches the meaning of "local" for individuals.
7 comments

My guess is that the popular US providers get a lot more traffic and are supply-limited. No point in lowering prices unless you can serve the traffic that will result.
Nothing weird about it. It’s all supply and demand.

The US providers are at capacity limits and are increasing pricing as demand increases.

The Chinese providers are relatively unknown and not even allowed for a lot of applications. They have to cut the price just to be attractive.

Can they actually make money at these prices?
Why do you think they have to?
Yesterday I did some testing on the cost of solving the same simple problem on OpenRouter with different models using Cline. The problem is simple, but solving it properly involves a few nuances, so it required reasoning.

After reading comments like this I was expecting (hoping?) that DeepSeek or similar would be cheaper.

However, I was surprised that DeepSeek v4 cost about 5.5x as much as GPT-5.4 to solve the problem.

- Deepseek-v4-pro-medium cost $2.47
- GPT-5.4-medium cost $0.45
- GPT-5.5-low cost $0.86
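For what it's worth, the 5.5x figure is consistent with the three costs listed above. A quick sketch of the arithmetic, where the dictionary and variable names are just illustrative and the numbers are copied from the list:

```python
# Per-task costs in USD, copied from the list above.
costs = {
    "Deepseek-v4-pro-medium": 2.47,
    "GPT-5.4-medium": 0.45,
    "GPT-5.5-low": 0.86,
}

# Use GPT-5.4-medium as the baseline, since the comparison is stated against it.
baseline = costs["GPT-5.4-medium"]

# Cost of each run expressed as a multiple of the baseline run.
ratios = {model: cost / baseline for model, cost in costs.items()}

for model, ratio in ratios.items():
    print(f"{model}: {ratio:.2f}x the GPT-5.4-medium cost")
```

This only compares totals for a single task, so it says nothing about why the token usage differed.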

That doesn't sound right. Were you using the actual DeepSeek provider? The one time I spent 3 dollars on DeepSeek in a day, I had 615k output tokens, 96M cache-hit input tokens, and 5M cache-miss input tokens.
It's not unheard of for "more expensive" models (on a per-token basis) to end up cheaper than weaker models (on a per-task basis).

Kimi K2.5 is roughly double the price (per token) of DeepSeek v4 Pro, but cost $0.05 vs $0.16 (for the same score) on my own benchmark.

https://sql-benchmark.nicklothian.com/?highlight=moonshotai_...

https://sql-benchmark.nicklothian.com/?highlight=deepseek_de...
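To make the per-token vs per-task point concrete, here is a rough sketch of what those two numbers imply. It assumes a single blended per-token price for each model (ignoring the input/output and cache split) and takes "roughly double" as exactly 2x, so treat the result as a ballpark. The function name is made up for illustration.

```python
def implied_token_ratio(cost_a: float, cost_b: float, price_a_over_b: float) -> float:
    """Estimate tokens_b / tokens_a from per-task costs and a per-token price ratio.

    Assumes cost = price * tokens with one blended price per model, so
    tokens_b / tokens_a = (cost_b / cost_a) * (price_a / price_b).
    """
    return (cost_b / cost_a) * price_a_over_b


kimi_cost = 0.05       # USD per task, from the comment above
deepseek_cost = 0.16   # USD per task, from the comment above
kimi_price_ratio = 2.0 # assumption: "roughly double" taken as exactly 2x

ratio = implied_token_ratio(kimi_cost, deepseek_cost, kimi_price_ratio)
print(f"DeepSeek used about {ratio:.1f}x as many tokens as Kimi for the same score")
```

Under those assumptions the cheaper-per-token model burned several times more tokens, which is how it ends up more expensive per task.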

Yeah, I struggle to spend more than a few dollars a day using DeepSeek V4 Pro (max reasoning).

* Some people suggest not using max reasoning because of overthinking and looping issues, which can consume more tokens than needed.

I imagine electricity in China costing a third of what it does in the US has a lot to do with it.
Given that you can run quantized Flash on 128 GB of RAM, and there's a heavy focus around it (DS4)... I'd say that it's pretty feasible for a decent number of devs. Never thought I'd buy an MBP but here we are.

n.b. I can't use nonlocal models for a big chunk of my work, so there's that as well.

IPO metrics juicing is a bitch
Capitalist competition at its finest