by YetAnotherNick
317 days ago
DeepSeek's main run cost $6M. qwen3-30b-a3b, which is ranked 13th, would probably cost a few hundred thousand dollars. The GPU cost of the final training run isn't the biggest chunk of the total, and you can probably replicate the results of models like Llama 3 very cheaply. It's the cost of experiments, researchers, and data collection that brings the overall cost one or two orders of magnitude higher.