HN Mirror



	by hantusk 830 days ago
	Digging into the low-rank structure of the gradients, rather than the weights, seems like a promising direction for training from scratch with lower memory requirements: https://twitter.com/AnimaAnandkumar/status/17656138151468933...
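
	A rough back-of-the-envelope sketch of where the memory saving could come from. It assumes a hypothetical scheme (not confirmed by the linked thread) in which the m x n gradient G is projected to r x n by an m x r projector P, and Adam's two moment buffers are kept only for the projected gradient. The function names and the sizes are illustrative assumptions.

```python
def adam_state_floats(m, n):
    """Standard Adam: two moment buffers, each the size of the m x n weight."""
    return 2 * m * n


def projected_adam_state_floats(m, n, r):
    """Assumed low-rank scheme: store the m x r projector P plus two
    moment buffers for the r x n projected gradient P^T G."""
    return m * r + 2 * r * n


# Illustrative sizes only; they do not come from the thread.
m, n, r = 4096, 4096, 128
full = adam_state_floats(m, n)
low = projected_adam_state_floats(m, n, r)
print(f"full Adam state: {full} floats")
print(f"projected state: {low} floats")
print(f"reduction:       {full / low:.1f}x")
```

	The weights themselves stay full-rank in this sketch; only the optimizer state shrinks, which is the difference from methods that impose low rank on the weights.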
