Hacker News new | ask | show | jobs
by shenberg 499 days ago
The DeepSeek v3 model had a net training cost of >$5m for the final training run, the paper lists over 100 authors[1], meaning highly-paid engineers. This is also one of a sequence of models (v1, v2, math, coder) trained in order to build the institutional knowledge necessary to get to the frontier , and this ends up still far above the $10m mark. It's hardly a "trio of super-smart engineers".

[1] https://arxiv.org/abs/2412.19437v1

2 comments

"Final" run... it took around half a billion - to a billion to get there.

https://arstechnica.com/ai/2025/01/why-the-markets-are-freak...

Incidentally, Altman's comments were in response to a question a out about a hypothetical startup with $10m. So you've made the argument even more cogent.