by logicchains
504 days ago
It's in the DeepSeek V3 paper, not the R1 paper. https://arxiv.org/html/2412.19437v1#abstract "assuming the rental price of the H800 GPU is $2 per GPU hour, our total training costs amount to only $5.576M." Note that's for V3, the base model; we don't know how much extra R1 cost to train.
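For anyone who wants to sanity-check the quoted figure, here is a rough sketch of the arithmetic it implies. The only inputs are the two numbers from the quote ($5.576M total, $2 per GPU hour); the function and variable names are made up for illustration, and the result covers V3 only, under the paper's assumed rental price.

```python
def implied_gpu_hours(total_cost_usd: float, price_per_gpu_hour_usd: float) -> float:
    """Back out GPU hours from a total cost and an assumed hourly rental price."""
    return total_cost_usd / price_per_gpu_hour_usd


# Figures taken from the quoted sentence in the V3 paper.
TOTAL_COST_USD = 5.576e6        # "$5.576M"
PRICE_PER_GPU_HOUR_USD = 2.0    # "$2 per GPU hour" (the paper's assumption)

gpu_hours = implied_gpu_hours(TOTAL_COST_USD, PRICE_PER_GPU_HOUR_USD)
print(f"{gpu_hours / 1e6:.3f}M H800 GPU hours")
```

So the headline dollar figure is just GPU hours times an assumed rental rate; change the assumed rate and the cost scales linearly with it.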
So all the claims about DeepSeek R1's training cost [0] are indeed bullshit that's been parroted around...
[0]: https://www.google.com/search?q=deepseek+r1+training+cost