by raincole 504 days ago
> It doesn't matter now as deepseek has shown

People keep saying that DeepSeek R1's training cost is just $5.6M. Where is the source?

I'm not even asking for proof. Just the source, even a self-claimed statement. I've read the R1 paper and it doesn't mention the $5.6M figure. Is it somewhere in DeepSeek's press release?

Google just gives me a lot of Medium articles and news sites. It sounds awfully like a number made up by some analyst that got parroted around. I've even seen people on X saying DeepSeek is "lying", while I can't even find what DeepSeek's exact claim is.


It's in the DeepSeek V3 paper, not the R1 paper. https://arxiv.org/html/2412.19437v1#abstract

"assuming the rental price of the H800 GPU is $2 per GPU hour, our total training costs amount to only $5.576M."

Note that's for V3, the base model; we don't know how much extra R1 cost to train.
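For what it's worth, the quoted sentence lets you back out the GPU-hour count behind the dollar figure. A rough sketch using only the two numbers in the quote; the function and constant names are mine, not anything from the paper:

```python
# Both figures come from the sentence quoted above (DeepSeek V3 paper).
RENTAL_PRICE_USD_PER_GPU_HOUR = 2.0   # assumed H800 rental price in the quote
TOTAL_COST_USD = 5.576e6              # "only $5.576M"


def implied_gpu_hours(total_cost_usd: float, price_per_gpu_hour: float) -> float:
    """Return the GPU hours implied by a total cost at a flat hourly price."""
    return total_cost_usd / price_per_gpu_hour


gpu_hours = implied_gpu_hours(TOTAL_COST_USD, RENTAL_PRICE_USD_PER_GPU_HOUR)
print(f"{gpu_hours:,.0f} H800 GPU hours")  # prints "2,788,000 H800 GPU hours"
```

So the $5.576M is just GPU hours times an assumed rental rate. It says nothing about R1's post-training, and the quote itself only covers the training run priced this way.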

I see. Thanks for the source.

So all the claims about DeepSeek R1's cost [0] are indeed bullshit parroted around...

[0]: https://www.google.com/search?q=deepseek+r1+training+cost

Not really; R1 is post-training on top of V3, which is considerably cheaper than training V3 itself. You can see this in the existence of multiple reproductions of the RL training technique by much smaller labs: https://hkust-nlp.notion.site/simplerl-reason
any source?

CNBC: https://noagendaassets.com/enc/1737931632.132_cnbctechceosso...

I don't wanna do all the work, folks.