| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by pseudonom- 1085 days ago
	There are other mechanisms for dealing with vanishing and exploding gradients. I (maybe wrongly?) think of batch normalization as being most distinctively about fighting internal covariate shift: https://machinelearning.wtf/terms/internal-covariate-shift/