| HN Mirror



	by jkam 2775 days ago
	What kind of work are you referring to when you say higher-order SGD may _now_ be feasible for deep learning? I can only find results that try to approximate second-order information.

1 comment

bmc7505 2767 days ago

Not sure what you mean. The paper above claims 1000x speedups for computing second-order derivatives. I have not tested their claims, but I was speculating that such an improvement, if true, would make computing Hessians for small networks feasible. That is what I was referring to.
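
For a sense of scale, here is a minimal sketch of what a full Hessian means for a very small network. It is not the method from the paper: it uses plain central finite differences, and the 2-2-1 tanh network, the input, the target, and the names `loss`, `hessian`, and `params` are all assumptions made up for illustration. The point is only that n parameters give an n x n matrix, so the cost grows quadratically.

```python
import math

def loss(params, x=(0.5, -1.0), target=0.25):
    """Squared error of a hypothetical 2-2-1 tanh network; params is a flat list of 9 floats."""
    w1 = params[0:4]   # 2x2 hidden weights, row-major
    b1 = params[4:6]   # hidden biases
    w2 = params[6:8]   # output weights
    b2 = params[8]     # output bias
    h = [math.tanh(w1[2 * i] * x[0] + w1[2 * i + 1] * x[1] + b1[i]) for i in range(2)]
    y = w2[0] * h[0] + w2[1] * h[1] + b2
    return 0.5 * (y - target) ** 2

def hessian(f, p, eps=1e-4):
    """Dense Hessian of f at p by central finite differences (O(n^2) loss evaluations)."""
    n = len(p)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            def shifted(di, dj):
                # Evaluate f with coordinates i and j perturbed by +/- eps.
                q = list(p)
                q[i] += di * eps
                q[j] += dj * eps
                return f(q)
            d2 = (shifted(1, 1) - shifted(1, -1)
                  - shifted(-1, 1) + shifted(-1, -1)) / (4 * eps * eps)
            H[i][j] = H[j][i] = d2  # fill both triangles; the Hessian is symmetric
    return H

# Arbitrary parameter values, chosen only for the example.
params = [0.1, -0.2, 0.3, 0.4, 0.0, 0.1, 0.5, -0.5, 0.2]
H = hessian(loss, params)
print(len(H), "x", len(H[0]), "Hessian for", len(params), "parameters")
```

Nine parameters already mean 81 entries, and a network with a million parameters would need 10^12, which is why exact second-order information is usually only discussed for small networks or approximated for large ones.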
