| I don't for a minute believe DeepSeek-V3 was built with a $6M rental. Their paper (arXiv 2412.19437) explains they used 2048 H800s. A computer cluster based on 2048 GPUs would have cost around $400M about two years ago when they built it. (Give or take, feel free to post corrections.) The point is they got it done cheaper than OpenAI/Google/Meta etc., but not cheaply. I believe the markets are overreacting. Time to buy (tinfa). |
"Assuming the rental price of the H800 GPU is $2 per GPU hour, our total training costs amount to only $5.576M. Note that the aforementioned costs include only the official training of DeepSeek-V3, excluding the costs associated with prior research and ablation experiments on architectures, algorithms, or data."
V3 was released a bit over a month ago, and it is R1, not V3, that took the world by storm. The price everyone is talking about is the price for V3.
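For anyone who wants to sanity-check the numbers in this thread, here is a rough back-of-the-envelope sketch. It only uses figures quoted above ($2 per GPU hour, $5.576M total, 2048 H800s, the parent's $400M cluster estimate). The wall-clock figure assumes all 2048 GPUs ran continuously for the whole run, which is my assumption and not something the quote states. Variable names are made up for illustration.

```python
# Back-of-the-envelope check using only figures quoted in this thread.
RENTAL_PRICE_PER_GPU_HOUR = 2.0        # USD, from the paper quote
TOTAL_TRAINING_COST = 5_576_000.0      # USD, from the paper quote
NUM_GPUS = 2048                        # H800s, per the parent comment
PARENT_CLUSTER_ESTIMATE = 400_000_000  # USD, the parent's own rough guess

# GPU hours implied by the quoted cost and rental price.
gpu_hours = TOTAL_TRAINING_COST / RENTAL_PRICE_PER_GPU_HOUR

# Wall-clock time, ASSUMING all 2048 GPUs were busy for the entire run.
wall_clock_hours = gpu_hours / NUM_GPUS
wall_clock_days = wall_clock_hours / 24

# Per-GPU price implied by the parent's $400M cluster estimate.
implied_cost_per_gpu = PARENT_CLUSTER_ESTIMATE / NUM_GPUS

print(f"Implied GPU hours:        {gpu_hours:,.0f}")
print(f"Wall-clock days (approx): {wall_clock_days:.1f}")
print(f"Implied $ per GPU:        {implied_cost_per_gpu:,.0f}")
```

So the rental figure covers roughly two months of time on the cluster for the final V3 run, while the $400M figure is an estimate of what it would cost to own the hardware. The two numbers are measuring different things.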