HN Mirror

	by currymj 2733 days ago
	As others have said, you don't actually want the global optimum of a neural network's training loss, because a model that fits the training set perfectly would be overfitting terribly. There is some evidence, though, that architectural tricks that empirically help performance (like the skip connections in ResNet) make the loss landscape "more convex": https://arxiv.org/pdf/1712.09913.pdf
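
	To make the overfitting point concrete, here is a minimal sketch, not taken from the linked paper. It swaps the neural network for a toy stand-in: a polynomial that passes through every noisy training point, which is a global optimum of the training loss. The target function, noise level, sample sizes, and all names (`true_fn`, `interpolate`, `memorizer`) are assumptions chosen only for illustration.

```python
import math
import random


def true_fn(x):
    # Assumed ground-truth signal; any smooth function would do.
    return math.sin(2 * math.pi * x)


def sample(xs, noise, rng):
    # Noisy observations of the signal at the given inputs.
    return [true_fn(x) + rng.gauss(0.0, noise) for x in xs]


def interpolate(train_x, train_y, x):
    # Lagrange form of the unique polynomial through all training points.
    total = 0.0
    for i, (xi, yi) in enumerate(zip(train_x, train_y)):
        term = yi
        for j, xj in enumerate(train_x):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def mse(predict, xs, ys):
    # Mean squared error of a predictor on a dataset.
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


rng = random.Random(0)  # fixed seed so reruns give the same numbers
noise = 0.3             # assumed noise standard deviation

train_x = [i / 9 for i in range(10)]
train_y = sample(train_x, noise, rng)
heldout_x = [rng.random() for _ in range(200)]
heldout_y = sample(heldout_x, noise, rng)


def memorizer(x):
    # The "global optimum" model: reproduces every training label, noise included.
    return interpolate(train_x, train_y, x)


train_mse = mse(memorizer, train_x, train_y)
heldout_mse = mse(memorizer, heldout_x, heldout_y)
# Error of the noise-free signal on the same held-out data, for reference only.
reference_mse = mse(true_fn, heldout_x, heldout_y)

print("train MSE (memorizer):   ", train_mse)
print("held-out MSE (memorizer):", heldout_mse)
print("held-out MSE (true_fn):  ", reference_mse)
```

	The memorizer drives the training error to zero by construction, but the held-out error stays well away from zero because the fit has absorbed the noise. The `true_fn` line is only a reference point for how small the held-out error could be; how the two compare will depend on the seed and noise level assumed here.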