| HN Mirror



	by pama 381 days ago
	Please read the DeepSeek analysis of their API service (linked in this article): they have a 500% profit margin and they are cheaper than any of the US companies serving the same model. It is conceivable that the API services of OpenAI or Anthropic have even higher profit margins. (GPUs are generally much more cost-effective and energy-efficient than CPUs if the solution maps to both architectures. Anthropic certainly caches the KV cache of their 24k-token system prompt.)
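
A profit figure above 100% only makes sense when profit is measured relative to cost, because profit as a share of revenue cannot exceed 100%. The sketch below shows the two conventions side by side. The cost and revenue numbers are hypothetical round values chosen to land on 500%; they are not DeepSeek's reported figures, and the function names are made up for illustration.

```python
def markup_on_cost(revenue: float, cost: float) -> float:
    """Profit divided by cost; this ratio can exceed 100%."""
    return (revenue - cost) / cost


def margin_on_revenue(revenue: float, cost: float) -> float:
    """Profit divided by revenue; this ratio stays below 100% when cost > 0."""
    return (revenue - cost) / revenue


# Hypothetical daily figures, for illustration only.
cost = 100.0
revenue = 600.0

print(f"profit relative to cost:    {markup_on_cost(revenue, cost):.0%}")
print(f"profit relative to revenue: {margin_on_revenue(revenue, cost):.1%}")
```

With these made-up numbers, a 500% figure relative to cost corresponds to roughly 83% of revenue being profit.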

3 comments

hedayet 381 days ago

That claim actually gives me pause. It reminds me of an idea from Zero to One by Peter Thiel - that real monopolies like to appear as a small fish in a very big pond, while tiny players try to appear as a monopoly.

So when I see a company bragging about "500% profitability," I can’t help but wonder if they’re even profitable at all.

withinboredom 381 days ago

I imagine pretty much none of them are profitable in the real accounting sense. However, if they all turned off their free plans -- they'd be insanely profitable.

pama 381 days ago

Please read their report. There is no bragging. It just tries to document performance and clarify a misconception. The idea that LLM inference may not be profitable or may be energy inefficient has been a constant song of misinformation, for reasons that I don't understand. DeepSeek does indeed claim to be of similar quality to others, but the work of their relatively small team is truly outstanding. As per a parallel thread, their result has by now been almost replicated by the sglang team. Link here: https://lmsys.org/blog/2025-05-05-large-scale-ep/

SEGyges 381 days ago

Every LLM provider caches its KV cache. It's a publicly documented technique (go stuff that KV in Redis after each request, basically) and a good engineering team could set it up in a month.

chipsrafferty 381 days ago

Are you saying if I ask a prompt "foo" and then a month later another user asks "foo" then it retrieves a cached value?

wkat4242 380 days ago

No, the key-value cache is the context, stored in a form the model can read directly.
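
To make the distinction concrete, here is a toy sketch of prefix caching in plain Python. It is only an illustration under loud assumptions: the class `PrefixKVCache`, the integer "tokens", and the placeholder `_compute_kv` are all hypothetical, and no provider's actual implementation is being described. The point is that the cache is keyed by the token prefix and holds per-token state for the prompt, not a finished answer.

```python
from typing import Dict, List, Tuple


class PrefixKVCache:
    """Toy prefix cache: maps a token prefix to the state of its last token."""

    def __init__(self) -> None:
        self._store: Dict[Tuple[int, ...], int] = {}
        self.tokens_computed = 0  # how many tokens needed fresh computation

    def _compute_kv(self, token: int, position: int) -> int:
        # Placeholder for the real attention key/value computation.
        self.tokens_computed += 1
        return token * 31 + position

    def prefill(self, tokens: List[int]) -> Tuple[List[int], int]:
        """Return (per-token state, number of tokens served from the cache)."""
        kv: List[int] = []
        reused = 0
        for pos in range(len(tokens)):
            # A token's state depends on everything before it,
            # so the key is the whole prefix up to this position.
            key = tuple(tokens[: pos + 1])
            if key in self._store:
                reused += 1
            else:
                self._store[key] = self._compute_kv(tokens[pos], pos)
            kv.append(self._store[key])
        return kv, reused


system_prompt = [1, 2, 3, 4]        # shared by every request
request_a = system_prompt + [10, 11]
request_b = system_prompt + [20]

cache = PrefixKVCache()
_, reused_a = cache.prefill(request_a)
_, reused_b = cache.prefill(request_b)
print(f"request A reused {reused_a} of {len(request_a)} tokens")
print(f"request B reused {reused_b} of {len(request_b)} tokens")
```

In this sketch the second request skips recomputation only for the shared system prompt. Even if an identical prompt arrives later, the cache supplies the prompt's state and the model still has to generate the response.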

iamnotagenius 381 days ago

With all due respect to DeepSeek, I would take their numbers with a grain of salt, as they might well be politically motivated.

jarym 381 days ago

Any more politically motivated than a model from anywhere else?

pama 381 days ago

The current version of sglang allows inference with the R1 model at a cost that is very close to the rate that DeepSeek claimed (using H100s, not exactly the DeepSeek compute). Their claim is almost validated by replication at this point, so there is nothing left to take with a grain of salt other than the possibility that an even higher margin than what they claimed exists if one were to optimize for modern NVIDIA hardware.

WithinReason 381 days ago

is that better or worse than commercially motivated?

leeoniya 381 days ago

commercial motivation needs to show eventual profit to be sustainable, while political does not.

though at the outset (pre-profit / private) it's hard to say there's much difference.

bee_rider 381 days ago

> though at the outset (pre-profit / private) it's hard to say there's much difference.

I think this is the tough part, we’re at the outset still.

Also, a political investment could be sustainable, in the sense that China might decide they are fine running DeepSeek at a loss indefinitely, if that’s what’s going on (hypothetically; I have never actually seen any evidence to suggest DeepSeek is subsidized, although I haven’t gone looking).

lazide 381 days ago

Also, see solar panel dumping as a quite successful example (on many, many fronts).
