| HN Mirror

	by t-vi 1021 days ago
	> Is avoiding CF potentially just a matter of sheer scale? My intuition would be that you get more orthogonal directions to the gradient (of previous samples) if you have a larger model.
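The geometric part of the quoted intuition can be checked with a toy experiment. This is only a sketch under a strong assumption: it treats "gradients" as independent random Gaussian vectors, which real gradients on related samples are not. The function name `mean_abs_cosine` and the chosen dimensions are made up for illustration. It shows that the average absolute cosine similarity between random directions shrinks as the dimension grows, so there is more room for updates that barely overlap with earlier ones.

```python
import numpy as np

def mean_abs_cosine(dim, n_pairs=500, seed=0):
    """Average |cos(angle)| between pairs of random Gaussian vectors in `dim` dimensions.

    Assumption: random vectors stand in for per-sample gradients.
    Real gradients are correlated, so this is an idealized best case.
    """
    rng = np.random.default_rng(seed)
    a = rng.standard_normal((n_pairs, dim))
    b = rng.standard_normal((n_pairs, dim))
    # Cosine similarity for each pair of rows.
    cos = np.sum(a * b, axis=1) / (
        np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
    )
    return float(np.mean(np.abs(cos)))

if __name__ == "__main__":
    # Larger "models" (more parameters) give directions that are closer to orthogonal.
    for dim in (16, 256, 4096):
        print(dim, mean_abs_cosine(dim))
```

Whether this carries over to catastrophic forgetting in practice depends on how correlated the gradients of old and new samples actually are, which this toy setup does not model.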