by pama 381 days ago
Please read the DeepSeek analysis of their API service (linked in this article): they have a 500% profit margin, and they are cheaper than any of the US companies serving the same model. It is conceivable that the API services of OpenAI or Anthropic have even higher profit margins.

(GPUs are generally much more cost effective and energy efficient than CPUs if the solution maps to both architectures. Anthropic certainly caches the KV-cache of their 24k token system prompt.)
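
A note on reading the 500% figure: a margin measured against revenue cannot exceed 100%, so a number like this only makes sense as profit divided by cost. Here is a minimal sketch of the arithmetic. The revenue and cost values are made-up placeholders, not DeepSeek's actual figures, and the function names are my own.

```python
def profit_over_cost(revenue: float, cost: float) -> float:
    """Profit as a fraction of cost (the only reading under which 500% is possible)."""
    return (revenue - cost) / cost


def profit_over_revenue(revenue: float, cost: float) -> float:
    """Conventional profit margin: profit as a fraction of revenue (at most 100%)."""
    return (revenue - cost) / revenue


# Hypothetical numbers chosen only to illustrate the ratio.
cost = 100.0
revenue = 600.0

print(f"profit / cost:    {profit_over_cost(revenue, cost):.0%}")     # 500%
print(f"profit / revenue: {profit_over_revenue(revenue, cost):.1%}")  # 83.3%
```

So under that reading, "500%" means revenue is about six times the serving cost, which corresponds to roughly an 83% margin in the conventional sense.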

That claim actually gives me pause. It reminds me of an idea from Zero to One by Peter Thiel - that real monopolies like to appear as a small fish in a very big pond, while tiny players try to appear as a monopoly.

So when I see a company bragging about "500% profitability," I can’t help but wonder if they’re even profitable at all.

I imagine pretty much none of them are profitable in the real accounting sense. However, if they all turned off their free plans, they'd be insanely profitable.
Please read their report. There is no bragging. It just tries to document performance and clarify a misconception. The concept that LLM inference may not be profitable or may be energy inefficient has been a constant song of misinformation for reasons that I don't understand. DeepSeek does indeed pretend to be of similar quality to others, but the work of their relatively small team is truly outstanding. As per a parallel thread, their result has by now been almost replicated by the sglang team. Link here: https://lmsys.org/blog/2025-05-05-large-scale-ep/
Every LLM provider caches their KV-cache; it's a publicly documented technique (basically, go stuff that KV in Redis after each request), and a good engineering team could set it up in a month.
Are you saying that if I send the prompt "foo" and then a month later another user sends "foo", it retrieves a cached value?
No, the key-value cache is the context, stored in a form the model can read directly. It is not a cache of answers.
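
To make the distinction concrete, here is a toy sketch of prefix reuse. It is not any provider's actual implementation: `compute_kv` is a stand-in for the expensive per-token key/value computation, and `PrefixKVCache` and its methods are hypothetical names. The point is that what gets reused is the work for a shared prompt prefix (such as a long system prompt), not a stored response.

```python
from typing import Dict, List, Tuple

KV = Tuple[int, int]  # toy stand-in for one token's key and value tensors


def compute_kv(token: int, position: int) -> KV:
    """Placeholder for the expensive per-token key/value computation."""
    return (token * 31 + position, token * 17 - position)


class PrefixKVCache:
    """Toy cache mapping a token prefix to its already computed KV entries."""

    def __init__(self) -> None:
        self._store: Dict[Tuple[int, ...], List[KV]] = {}
        self.computed = 0  # counts how many tokens actually had to be processed

    def prefill(self, tokens: List[int]) -> List[KV]:
        # Look for the longest stored prefix of this prompt.
        kv: List[KV] = []
        for n in range(len(tokens), 0, -1):
            hit = self._store.get(tuple(tokens[:n]))
            if hit is not None:
                kv = list(hit)
                break
        # Only the tokens after the cached prefix need fresh computation.
        for pos in range(len(kv), len(tokens)):
            kv.append(compute_kv(tokens[pos], pos))
            self.computed += 1
        self._store[tuple(tokens)] = list(kv)
        return kv


cache = PrefixKVCache()
system_prompt = [1, 2, 3, 4]           # imagine this is thousands of tokens long
cache.prefill(system_prompt)           # pays for 4 tokens once
cache.prefill(system_prompt + [9, 8])  # pays only for the 2 new tokens
cache.prefill(system_prompt + [7])     # pays only for the 1 new token
print(cache.computed)                  # 7, versus 15 without any reuse
```

The model still generates a fresh answer for every request; the saving is in not recomputing the shared prefix each time.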
With all due respect to DeepSeek, I would take their numbers with a grain of salt, as they might as well be politically motivated.
Any more politically motivated than a model from anywhere else?
The current version of sglang allows inference with the R1 model at a cost that is very close to the rate that DeepSeek claimed (using H100s, not exactly the DeepSeek compute). Their claim is almost validated by replication at this point, so there is nothing left to take with a grain of salt other than the possibility that an even higher margin than what they claimed exists if one were to optimize for modern NVIDIA hardware.
is that better or worse than commercially motivated?
commercial motivation needs to show eventual profit to be sustainable, while political does not.

though at the outset (pre-profit / private) it's hard to say there's much difference.

> though at the outset (pre-profit / private) it's hard to say there's much difference.

I think this is the tough part, we’re at the outset still.

Also, a political investment could be sustainable, in the sense that China might decide they are fine running DeepSeek at a loss indefinitely, if that’s what’s going on (hypothetically; I have never actually seen any evidence to suggest DeepSeek is subsidized, although I haven’t gone looking).

Also, see solar panel dumping as a quite successful example (on many, many fronts).