|
|
|
|
|
by K0balt
505 days ago
|
|
If they bought them outright, they might have paid 60m, (GPU only) . After infrastructure, maybe 100M. Calling the training load for DeepSeek 6% of the value of that cluster seems generous. It probably used less of the recoverable value than that. |
|
I think the salient point here is that the "price to train" a model is a flashy number that's difficult to evaluate out of context. American companies list the public cloud price to make it seem expensive; Deepseek has an incentive to make it sound cheap.
The real conclusion is that world-class models can now be trained even if you're banned from buying Nvidia cards (because they've already proliferated), and that open-source has won over the big tech dream of gatekeeping the technology.