| HN Mirror



	by throwthrowuknow 691 days ago
	That's not going to work for training from scratch, which is what the author is doing.

1 comment

192 GB of RAM is not enough to train a 405B model. Reflection 70B requires 140 GB of RAM in fp16 for the weights alone (2 bytes per parameter), so a 405B model would need ~810 GB.
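For anyone who wants to check the arithmetic, here is a rough sketch in Python. It only counts the weights at 2 bytes per parameter (fp16) and uses decimal GB; the function name `weight_memory_gb` is made up for illustration, and training would need more on top of this for gradients, optimizer state, and activations.

```python
def weight_memory_gb(params_billion: float, bytes_per_param: int = 2) -> float:
    """Memory for the model weights alone, in decimal GB (1 GB = 1e9 bytes).

    Assumption: every parameter is stored at the same width
    (2 bytes for fp16). Nothing else is counted.
    """
    # billions of parameters * bytes per parameter = GB
    return params_billion * bytes_per_param


available_gb = 192  # the RAM figure from the comment above

for size in (70, 405):
    needed = weight_memory_gb(size)
    verdict = "fits" if needed <= available_gb else "does not fit"
    print(f"{size}B in fp16: {needed:.0f} GB of weights, {verdict} in {available_gb} GB")
```

So the 70B weights fit in 192 GB at fp16, while the 405B weights do not, and that is before any training overhead.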

Pretty sure he said he’s running inference on Llama 3 405B and training his own custom model from scratch. He didn’t say how big his custom model will be.