by PeterisP
3017 days ago
DL models don't often get stuck in local optima. In theory they could be vulnerable to that, but in practice it simply doesn't happen in most practical supervised learning applications. I'm not up to date on the theoretical research on this topic, but as far as I recall there are some interesting demonstrations on realistic problems showing that the different "local" optima resulting from different random initializations are actually all "connected", i.e. there exists a path from a worse "local optimum" to a better one along which the model never gets worse. So in reality it's not a local optimum; it's just nontrivial to find the path to a better optimum when the space is very high-dimensional.
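To make the "connected" idea concrete, here is a minimal toy sketch, not a neural network and not taken from any of the demonstrations I mentioned. It assumes a made-up 2D loss whose minima form a ring; the names `loss`, `straight`, `arc` and `max_loss_along` are just illustrative. Two solutions look separated by a barrier if you walk straight between them, yet a curved path joins them without the loss ever rising.

```python
import math

def loss(x, y):
    # Hypothetical toy loss: zero everywhere on the unit circle, positive elsewhere.
    return (x * x + y * y - 1.0) ** 2

def max_loss_along(path, steps=100):
    # Highest loss seen at evenly spaced points t in [0, 1] along the path.
    return max(loss(*path(i / steps)) for i in range(steps + 1))

# Two "solutions", standing in for the results of two different initializations.
a = (1.0, 0.0)
b = (-1.0, 0.0)

def straight(t):
    # Linear interpolation from a to b; passes through the origin.
    return (a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1]))

def arc(t):
    # Curved path from a to b that stays on the ring of minima.
    return (math.cos(math.pi * t), math.sin(math.pi * t))

straight_barrier = max_loss_along(straight)
arc_barrier = max_loss_along(arc)

print("max loss on straight line:", straight_barrier)
print("max loss on curved path:  ", arc_barrier)
```

The straight line climbs over a bump in the middle, while the arc stays at essentially zero loss the whole way. In two dimensions the good path is obvious; the point of my comment is that in a very high-dimensional weight space such a path may exist but be hard to find.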