by PartiallyTyped 1069 days ago
In theory, first-order optimizers have trouble because they can get stuck in local minima. In practice, however, things are different.

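When your parameter space has on the order of billions of dimensions, there is, for all practical purposes, always a direction of descent. A critical point is only a local minimum if the loss curves upward along every one of those directions at once, and that becomes very unlikely as the dimension grows, so most critical points you run into are saddle points rather than minima.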
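To make the "always a direction of descent" intuition concrete, here is a rough sketch. It assumes a toy model in which the Hessian at a critical point is a random symmetric Gaussian matrix, which is not how real neural network Hessians behave, so treat it as an illustration of the counting argument only. The names `is_positive_definite` and `fraction_of_minima` are made up for this example. It estimates how often such a random critical point is a strict local minimum (no descent direction at all) as the dimension grows.

```python
import random


def is_positive_definite(m):
    """Try a Cholesky factorization; it succeeds only if m is positive definite."""
    n = len(m)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = m[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    # A non-positive pivot means some direction curves downward or is flat.
                    return False
                low[i][i] = s ** 0.5
            else:
                low[i][j] = s / low[j][j]
    return True


def fraction_of_minima(dim, trials=2000, seed=0):
    """Fraction of random symmetric 'Hessians' (toy assumption) that are positive definite."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        h = [[0.0] * dim for _ in range(dim)]
        for i in range(dim):
            for j in range(i + 1):
                # Fill symmetrically with standard normal entries.
                h[i][j] = h[j][i] = rng.gauss(0.0, 1.0)
        if is_positive_definite(h):
            hits += 1
    return hits / trials


if __name__ == "__main__":
    for dim in (1, 2, 4, 8):
        print(dim, fraction_of_minima(dim))
```

In this toy model the fraction starts near one half in one dimension and collapses toward zero within a handful of dimensions, which is the counting argument in miniature; with billions of parameters, a point with no descent direction at all would be extraordinarily rare under the same assumption.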

Moreover, local minima seem to be rather close to the global minima.