Hacker News new | ask | show | jobs
by rwilson4 2724 days ago
SVMs are a convex optimization problem, which means any local optimum is also globally optimal. In practice that leads to fast, provably optimal solutions.

With deep learning, the problem is non-convex, so the results are locally optimal but not necessarily globally optimal. We get around this with different techniques such as randomized initial weights, but with few exceptions, non-convex problems are NP-complete. Therefore there is no polynomial-time algorithm for finding the global optimum in deep learning unless P==NP.

I occasionally hear people claim deep learning finds the globally optimal solution, but that’s just not correct (or they mean they are doing an exhaustive search e.g. branch and bound, which has exponential worst case run time and isn’t practical).

That said, locally optimal solutions are often “good enough” for practical problems.

1 comments

Its a small mercy that one does not find the global minimum of DNNs. Given their number of parameters and no explicit regularization term, they would overfit viciously. The inability to reach the global and the jiggling around imposed by stochastic gradient descent acts as implicit regularization