| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by jeffchao 695 days ago
	This is very impressive, though an adjacent question — does anyone know roughly how much time and compute cost it takes to train something like the 405B? I would imagine with all the compute Meta has that the moat is incredibly large in terms of being able to train multiple 405B-level morels and compete.

1 comments

sva_ 695 days ago

30.84M H100 compute-hours, according to the model card

https://github.com/meta-llama/llama-models/blob/main/models/...

link

Escapado 695 days ago

Interestingly that’s less energy than the mass energy equivalent of one gram of matter or roughly 5 seconds worth of the worlds average energy consumption (according to wolfram alpha). Still an absolute insane amount of energy, as in about 5 million dollars at household electricity rates. Absolutely wild how much compute goes into this.

link