| HN Mirror

	by solveit 619 days ago
	Setting aside the practicality of reproducing training runs that cost tens of millions, it's hopeless. Determinism is hard enough on a single GPU; fixing a seed isn't going to help much when training is distributed across hundreds of GPUs.
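
	One likely reason a seed doesn't save you (the comment doesn't spell it out, so treat this as an assumption): floating-point addition is not associative, so summing the same values in a different order can give a different result. A minimal pure-Python sketch, not GPU code; `reduce_in_order` and the toy values are hypothetical stand-ins for per-worker contributions being summed in whatever order they arrive:

```python
def reduce_in_order(values, order):
    """Sum `values` sequentially, visiting indices in the given `order`."""
    total = 0.0
    for i in order:
        total += values[i]
    return total


# Toy "per-worker contributions": identical inputs in both runs, no randomness.
contributions = [1e16, 1.0, -1e16]

# Run A: workers happen to be combined in index order 0, 1, 2.
run_a = reduce_in_order(contributions, [0, 1, 2])

# Run B: same numbers, but worker 2 is combined before worker 1.
run_b = reduce_in_order(contributions, [0, 2, 1])

print(run_a)  # 0.0 (the 1.0 is lost to rounding when added to 1e16)
print(run_b)  # 1.0 (the large terms cancel first, so the 1.0 survives)
```

	A seed only pins down the random numbers. It says nothing about the order in which partial sums get combined, and that order can vary from run to run when many devices are involved.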