by HarHarVeryFunny
501 days ago
When people are talking about $100M-$1B frontier model training runs, efficiency obviously matters! Sure, training costs will go down with time, but if you are using only 10% of your competition's compute (TFA: DeepSeek vs LLaMa), then you could be saving hundreds of millions per training run!