by currymj
2733 days ago
As others have said, you don't actually want the global optimum of a neural network's training loss, because reaching it would mean terrible overfitting. There is some evidence, though, that architectural tricks that empirically help performance (like ResNet) make the loss landscape "more convex": https://arxiv.org/pdf/1712.09913.pdf