| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by MiroF 2384 days ago
	Yep, and I'm not saying its a bad approach! Just trying to answer "why is that any worse than, say, starting with randomly initialized weights in general?" wrt gradient passing I'm not sure I'd agree with the "noisy" characterization - which to me implies stochasticity-, whereas this is just blocking off the flow of gradient information to save memory.