by K0balt 505 days ago
If they bought them outright, they might have paid 60M (GPU only). After infrastructure, maybe 100M.

Calling the training load for DeepSeek 6% of the value of that cluster seems generous. It probably used less of the recoverable value than that.
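
For anyone who wants the back-of-envelope version, here is a rough sketch of what that 6% works out to. The cluster costs and the 6% share are just the guesses from this comment, not measured figures, and the function name is made up for illustration.

```python
# Rough figures from the comment above; all of them are estimates.
GPU_ONLY_COST = 60_000_000       # "60M (GPU only)"
WITH_INFRA_COST = 100_000_000    # "maybe 100M" after infrastructure
TRAINING_SHARE_PERCENT = 6       # share of cluster value attributed to training


def attributed_cost(cluster_cost: int, share_percent: int) -> int:
    """Return the part of the cluster cost attributed to one training run."""
    return cluster_cost * share_percent // 100


low = attributed_cost(GPU_ONLY_COST, TRAINING_SHARE_PERCENT)
high = attributed_cost(WITH_INFRA_COST, TRAINING_SHARE_PERCENT)
print(f"{low:,} to {high:,}")
```

So the attributed cost lands somewhere between 3.6M and 6M depending on whether infrastructure is counted, and lower still if the real share is under 6%.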


Electricity in China, even at residential rates, is 1/10th the cost it is in CA.

I think the salient point here is that the "price to train" a model is a flashy number that's difficult to evaluate out of context. American companies list the public cloud price to make it seem expensive; DeepSeek has an incentive to make it sound cheap.

The real conclusion is that world-class models can now be trained even if you're banned from buying Nvidia cards (because they've already proliferated), and that open-source has won over the big tech dream of gatekeeping the technology.