by muppet_frog
2069 days ago
This paper makes the point that it's the saddle points, not the local minima, that are the problem:
https://arxiv.org/abs/1406.2572
Adding 'momentum' to optimizers helps with this, since it lets you skate across the saddles, although momentum itself predates the paper.
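To make the "skating" idea concrete, here is a rough toy sketch, not anything taken from the paper. It assumes a made-up saddle f(x, y) = x**2 - y**2, a start point sitting almost exactly on the saddle's ridge, and hand-picked values for `lr` and `beta`. The function `steps_to_escape` and the escape threshold of 1.0 are hypothetical names and choices for illustration only. It counts how many steps plain gradient descent and heavy-ball momentum each need before the iterate slides off along the downhill direction.

```python
def grad(x, y):
    # Gradient of the toy saddle f(x, y) = x**2 - y**2 (an assumed example surface).
    return 2.0 * x, -2.0 * y


def steps_to_escape(beta, lr=0.01, max_steps=10_000):
    """Count steps until |y| > 1, i.e. until we have left the saddle region.

    beta=0.0 is plain gradient descent; beta>0 is heavy-ball momentum.
    lr, beta and the threshold 1.0 are illustrative assumptions.
    """
    x, y = 1.0, 1e-6      # start almost exactly on the ridge through the saddle
    vx, vy = 0.0, 0.0     # velocity terms
    for step in range(1, max_steps + 1):
        gx, gy = grad(x, y)
        vx = beta * vx - lr * gx   # accumulate a decaying sum of past gradients
        vy = beta * vy - lr * gy
        x += vx
        y += vy
        if abs(y) > 1.0:
            return step
    return max_steps


plain_steps = steps_to_escape(beta=0.0)
momentum_steps = steps_to_escape(beta=0.9)
print("plain GD steps:", plain_steps)
print("momentum steps:", momentum_steps)
```

On this toy surface the momentum run should leave the saddle in noticeably fewer steps, because the tiny gradients along the escape direction keep accumulating in the velocity instead of being applied one at a time. Real loss surfaces are much higher dimensional, so treat this only as intuition.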