| HN Mirror



	by heavenlyblue 807 days ago
	They don't optimise all layers jointly; instead, each layer is trained independently of the others.
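	To make the distinction concrete, here is a rough sketch of layer-wise training in plain Python. It is not the method being discussed, whose details aren't given here. It assumes each layer gets its own local objective (a private squared-error readout, chosen purely for illustration) and that a layer's input is treated as a constant, so no gradient ever flows back into earlier layers. The names `LocalLayer`, `local_step` and `train_layerwise` are made up for this example.

```python
import math
import random


class LocalLayer:
    """One scalar layer h = tanh(w*x + b) with its own private readout r.

    Assumption for illustration: the layer's local loss is (r*h - y)^2.
    """

    def __init__(self, rng):
        self.w = rng.uniform(-0.5, 0.5)
        self.b = 0.0
        self.r = rng.uniform(-0.5, 0.5)

    def forward(self, x):
        return math.tanh(self.w * x + self.b)

    def local_step(self, x, y, lr=0.1):
        """One SGD step on this layer's own loss; x is treated as a constant."""
        h = self.forward(x)
        err = self.r * h - y
        # Gradient of the local loss w.r.t. the pre-activation (w*x + b).
        d_pre = 2 * err * self.r * (1 - h * h)
        grad_r = 2 * err * h
        self.w -= lr * d_pre * x
        self.b -= lr * d_pre
        self.r -= lr * grad_r
        return err * err


def train_layerwise(data, n_layers=2, epochs=200, seed=0):
    """Train a stack of layers, each on its own local loss only."""
    rng = random.Random(seed)
    layers = [LocalLayer(rng) for _ in range(n_layers)]
    history = [[] for _ in layers]  # mean local loss per epoch, per layer
    for _ in range(epochs):
        totals = [0.0] * n_layers
        for x, y in data:
            inp = x
            for i, layer in enumerate(layers):
                totals[i] += layer.local_step(inp, y)
                # The output is handed on as a plain number: later layers
                # never send an error signal back to this one.
                inp = layer.forward(inp)
        for i in range(n_layers):
            history[i].append(totals[i] / len(data))
    return layers, history
```

	Calling `train_layerwise` on a handful of `(x, y)` pairs returns the trained layers plus a per-layer loss history. The thing to notice is that there is no single loss for the whole stack: joint training would backpropagate one final error through every layer, whereas here each layer only ever sees its own.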