by PartiallyTyped
1069 days ago
First-order optimizers are supposed to have trouble because they get stuck in local minima, but in practice things are different. When your parameter space is on the order of billions of dimensions, there is, for all practical purposes, always a direction of descent. Moreover, the local minima that do exist seem to be rather close to the global minimum.
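To give a rough feel for the "always a direction of descent" point, here is a toy sketch. It assumes the Hessian at a critical point behaves like a random symmetric Gaussian matrix, which is only a crude stand-in for a real network's loss surface. A critical point is a local minimum only if that matrix is positive definite; otherwise some direction still goes downhill. The function names (`is_positive_definite`, `random_symmetric`, `fraction_of_minima`) are made up for this illustration.

```python
import random


def is_positive_definite(a):
    """Return True if the symmetric matrix `a` admits a Cholesky factorization."""
    n = len(a)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                # A non-positive pivot means the matrix is not positive definite.
                if s <= 0.0:
                    return False
                lower[i][i] = s ** 0.5
            else:
                lower[i][j] = s / lower[j][j]
    return True


def random_symmetric(n, rng):
    """Toy 'Hessian': symmetric n x n matrix with standard normal entries (an assumption)."""
    m = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            value = rng.gauss(0.0, 1.0)
            m[i][j] = value
            m[j][i] = value
    return m


def fraction_of_minima(n, trials=2000, seed=0):
    """Estimate how often a random n-dimensional critical point has no descent direction."""
    rng = random.Random(seed)
    hits = sum(is_positive_definite(random_symmetric(n, rng)) for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    for dim in (1, 2, 4, 6):
        print(dim, fraction_of_minima(dim))
```

Under this toy model the fraction of critical points that are true minima drops off very quickly as the dimension grows, which is the intuition behind the claim above. It says nothing about how close those minima are to the global one.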