

by cpldcpu 549 days ago

The $6M figure that is being thrown around comes from the DeepSeek V3 paper and covers the cost of a single training run for DeepSeek V3, the base model that R1 is built on.

The number does not include the cost of personnel, experiments, data preparation, or chasing dead ends, and most importantly, it does not include the reinforcement learning step that made R1 good.

Furthermore, it does not factor in that both V3 and R1 are built on top of an enormous amount of synthetic data that was generated by other LLMs.