by maltalex 28 days ago
This looks suspiciously cheap.

The same model hosted by other providers is much more expensive [0]. So either DeepSeek can host it much cheaper than anyone else, or their business model is different. I suspect the latter, especially since their privacy policy [1] says personal data, including “User Input,” can be used "To improve and develop the Services and to train and improve our technology".

[0]: https://openrouter.ai/deepseek/deepseek-v4-pro/providers

[1]: https://cdn.deepseek.com/policies/en-US/deepseek-privacy-pol...

4 comments

There are several things at play:

Inference stack efficiency: Many of these providers take off-the-shelf sglang / vllm / trtllm and hope for the best. Meanwhile, the DeepSeek team is known for pushing the boundary of optimizations.

Now, sglang and vllm are great pieces of software, but take DeepSeek's Sparse Attention (DSA). Introduced 1.5 years ago (https://arxiv.org/abs/2512.02556), used by DeepSeek 3.2, GLM 5, DeepSeek V4. Only now is it slowly starting to get optimized in the major inference engines (https://github.com/sgl-project/sglang/issues/19380 https://github.com/sgl-project/sglang/pull/22851 etc.). Of course, DS V4 adds extra optimizations into the model architecture on top of DSA, and those will take more time to be taken full advantage of by the open source inference engines.

Privacy: Betting that people will pay extra for inference hosted outside China. This is especially true with DeepSeek, because DeepSeek is transparent about using API data for model improvements.

And a few other things: scale (matters a lot for MoEs), reliability, soft enterprise lock-in, etc.

---

There is also, likely, tacit collusion at play here. Look at GLM 5 and GLM 5.1 prices. GLM 5 and 5.1 cost the same to run, but providers decided to charge much more for 5.1 because it is a much better model, and because Z.AI raised their price as well.

Another factor is that DeepSeek is not just doing inference, but also training models, so they can use underutilized compute nodes for training during off-peak hours, as described in their DeepSeek v3 article: https://github.com/deepseek-ai/open-infra-index/blob/main/20...

But I agree that the main driver is that they are really good at optimizing. They will have chosen their architecture in such a way that it will be as efficient as possible on their own infrastructure, so they have a massive head start. Inference framework developers still have to catch up.
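To put rough shape on the off-peak point above, here is a minimal sketch of how reusing idle nodes changes the cost attributed to inference. The function name, its parameters, and every number in the usage lines are hypothetical, chosen only for illustration; nothing here comes from DeepSeek's published figures.

```python
def cost_per_million_tokens(node_cost_per_hour, tokens_per_second,
                            busy_fraction, idle_hours_reused):
    """Cost of serving one million tokens on a single node.

    busy_fraction: share of the day the node spends serving inference traffic.
    idle_hours_reused: if True, idle hours are billed to another workload
    (for example training), so inference only pays for the hours it uses.
    All inputs are illustrative assumptions, not measured values.
    """
    if not 0 < busy_fraction <= 1:
        raise ValueError("busy_fraction must be in (0, 1]")
    tokens_per_busy_hour = tokens_per_second * 3600
    # Node-hours that inference has to pay for, per hour of real traffic.
    billed_hours = 1.0 if idle_hours_reused else 1.0 / busy_fraction
    cost_per_token = node_cost_per_hour * billed_hours / tokens_per_busy_hour
    return cost_per_token * 1_000_000


# Hypothetical node: $10/hour, 1000 tokens/s, busy half the day.
dedicated = cost_per_million_tokens(10.0, 1000.0, 0.5, idle_hours_reused=False)
shared = cost_per_million_tokens(10.0, 1000.0, 0.5, idle_hours_reused=True)
print(f"inference-only provider: ${dedicated:.2f} per 1M tokens")
print(f"idle hours reused:       ${shared:.2f} per 1M tokens")
```

Under these assumptions the gap is simply 1 / busy_fraction, so a provider that can fill its idle hours with training would have a structural cost advantage over an inference-only host with the same hardware.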

Probably a dumb question, but looking at OpenRouter, are there really no providers outside of the US, Singapore and China offering DeepSeek? It seems like such an obvious thing for a European or other Western provider to offer. I'm sure it's a quantum leap ahead of Mistral.

I'd love to give these models a try, but I'd rather not use a provider that trains on or stores my data (beyond standard legal requirements of course).

In case anyone finds this post and was still looking - it seems like Inceptron are a Swedish company with data centers in Finland that offer inference of Chinese models (Kimi K2.6, GLM 5.1, MiniMax M2.5), but they don't yet offer DeepSeek V4. Also their models all appear to be quantized, so presumably not the same as inference direct from the model providers.

https://www.inceptron.io/models

Crof.ai
Just checked, Crof.ai links to "Nahcrof LLC", and the terms and conditions say "These Terms are governed by the laws of the United States."

Though to be honest, I'm not sure I want to trust business workflows to a website where the only contact is a Gmail address and no physical contact address. That site looks incredibly dodgy.

They're selling at a loss (obviously).

But why not? Gaining market share at a loss isn't the US's patent.

They haven't raised enough money to be selling at a loss. And selling at a loss to gain market share in an industry with zero switching friction between sellers is not a strategy. That doesn't make sense.

Loss leading only works when

- it leads to a situation that allows you to prevent competitors from selling to your customers (gilded age railroad and pipeline industries are great examples). Then you can eventually raise prices and not lose back any market share.

- or when it allows you to remarket to customers and make back the difference (selling a single console at a loss to sell a whole library of high-margin video games, or selling jet engines at a loss to lock in 30-year maintenance contracts).
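The two conditions above boil down to whether customers stay after prices go up. A back-of-the-envelope sketch, with a hypothetical function and made-up numbers, assuming a fixed per-period retention rate once the discount ends:

```python
def loss_leader_net(subsidy_per_customer, later_margin_per_period,
                    retention_rate, periods):
    """Net result per acquired customer of a loss-leading strategy.

    subsidy_per_customer: money lost per customer during the discount phase.
    later_margin_per_period: margin earned per period once prices are raised.
    retention_rate: fraction of remaining customers kept each period after
        the price increase (0.0 = everyone switches, 1.0 = full lock-in).
    All values are illustrative assumptions.
    """
    if not 0.0 <= retention_rate <= 1.0:
        raise ValueError("retention_rate must be in [0, 1]")
    recouped = sum(later_margin_per_period * retention_rate ** t
                   for t in range(1, periods + 1))
    return recouped - subsidy_per_customer


# Zero switching friction: customers leave as soon as the price rises.
print(loss_leader_net(100.0, 30.0, retention_rate=0.0, periods=12))
# Strong lock-in (railroads, consoles, maintenance contracts).
print(loss_leader_net(100.0, 30.0, retention_rate=1.0, periods=12))
```

With retention near zero the subsidy is never recovered, which is the argument against loss leading in a market where switching API providers is a one-line change.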

Yeah, cool theory, but they are selling at a loss. We know that because their model is open and available on other providers too. No other provider even sells a quantized version of DeepSeek V4 Pro at that price.

Also, in the case of LLMs, market share = more people uploading their whole codebase/legal documents/unfinished books/literally everything to your servers for you to use in future training. So the incentive to sell at a loss is much stronger than for other kinds of services.

We are missing the fact that they have created their own GPUs, which are now just 4-5 years behind. And considering it's China, which does everything hardware at insane scale and efficiency, my guess is that they are at step 1 now: gain market share at a loss and, at the same time, gradually start plugging in their in-house cards to power these models and gauge their performance on real workloads.

Once they cross a certain threshold, Nvidia can say goodbye to its monopolistic profit margins of over 70%.

GPU infra capex is the biggest spend for inference providers as of now; power is the second biggest.

China has already cracked the power part, they are now close to cracking the GPU part.
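For anyone who wants to play with the capex-versus-power split, a minimal sketch follows. The function, the straight-line amortization, and all input numbers are assumptions for illustration; real deployments also pay for networking, cooling, staff and so on, which this ignores.

```python
HOURS_PER_YEAR = 24 * 365


def hourly_cost_breakdown(gpu_price, amortization_years, power_kw, price_per_kwh):
    """Split the hourly cost of one accelerator into capex and power.

    gpu_price: purchase price, amortized linearly over amortization_years.
    power_kw: average draw in kilowatts, including any overhead you assume.
    price_per_kwh: electricity price.
    Every input is a placeholder, not vendor or provider data.
    """
    capex_per_hour = gpu_price / (amortization_years * HOURS_PER_YEAR)
    power_per_hour = power_kw * price_per_kwh
    total = capex_per_hour + power_per_hour
    return {
        "capex_per_hour": capex_per_hour,
        "power_per_hour": power_per_hour,
        "capex_share": capex_per_hour / total,
    }


# Hypothetical card: $35,040 over 4 years, 1 kW draw, $0.25 per kWh.
breakdown = hourly_cost_breakdown(35_040, 4, 1.0, 0.25)
for name, value in breakdown.items():
    print(f"{name}: {value:.3f}")
```

Lowering either gpu_price (cheaper in-house cards) or price_per_kwh (cheaper power) moves the total directly, which is the mechanism the comment above is pointing at.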

Didn’t the DeepSeek team release a paper documenting inference improvements that showed they were still making a profit even under heavy discount? Why would it be impossible for them to make a profit now, with a new model and more research?

Before DeepSeek, no one sold cheap tokens anyways and then DS showed the profit margins.

they might have trained the model with fancy optimisations that only they can unlock
Maybe Anthropic's efforts to thwart DeepSeek from distilling their model are bearing fruit.

So their strategy now is to try to get as much raw content for their inference. You're being "paid", via discount, for your use.

> So their strategy now is to try to get as much raw content for their inference. You're being "paid", via discount, for your use.

There is an implicit social contract, and for many it might work out well:

We use your data to improve the model. You get to use the improved model for affordable prices and (the important part): you get _the model_.

From Anthropic's own report:

"DeepSeek

Scale: Over 150,000 exchanges"

Doesn't sound like much distilling. Maybe they are running benchmarks?

Proof?
You may not know enough about DeepSeek founder Liang Wenfeng, who is also the founder of High-Flyer Quant