HN Mirror



	by SleekEagle 1515 days ago
	Ultimately it comes down to gradient descent (which is pretty magical in its own right), but what's most surprising to me is that the loss landscape is actually organized enough to yield impressive results. Obviously the difficulties of training large NNs are well-documented, but I'm surprised it's even that easy.
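
	To make the "loss landscape" point concrete, here is a minimal sketch of plain gradient descent on a toy non-convex 1-D function. Everything in it is an assumption chosen for illustration (the function `loss`, the learning rate, the step count, the starting points); it is nowhere near a real NN loss, it just shows that where you end up depends on the shape of the landscape and on where you start.

```python
# Toy illustration only: plain gradient descent on a hand-picked
# non-convex function with two minima. All names and constants here
# are hypothetical choices, not taken from any library or paper.

def loss(x: float) -> float:
    # Non-convex "landscape": one minimum near x = 1.13, a deeper one near x = -1.30.
    return x**4 - 3 * x**2 + x

def grad(x: float) -> float:
    # Analytic derivative of loss.
    return 4 * x**3 - 6 * x + 1

def gradient_descent(x0: float, lr: float = 0.01, steps: int = 1000) -> float:
    # Repeatedly step against the gradient from the starting point x0.
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)
    return x

right = gradient_descent(2.0)   # starts on the right-hand slope
left = gradient_descent(-2.0)   # starts on the left-hand slope

print(f"start  2.0 -> x = {right:.4f}, loss = {loss(right):.4f}")
print(f"start -2.0 -> x = {left:.4f}, loss = {loss(left):.4f}")
```

	The two runs settle in different minima, and only the one started at -2.0 finds the deeper one. The surprising part for large NNs is that this kind of getting stuck seems to hurt far less than a toy like this would suggest.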