by tmyklebu
3093 days ago
> You can make an argument that, in high-dimensional parameter spaces, it’s hard to do better (because, gradient descent is linear in the number of dimensions).

Nitpick: If you're interested in solving optimisation problems, it's very easy to do better than gradient descent. Gradient descent performs very poorly when the curvature of the objective function varies a lot from one direction to another (that is, when the Hessian is ill-conditioned). Newton's method, quasi-Newton methods, and nonlinear conjugate gradient are some of the more ingenious, beautiful, and clean alternatives; there are also some dirty, hacky ways to go. It is a little bit interesting that optimisation algorithms fancier than gradient descent are unnecessary or unhelpful in some large-scale applications.
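To make the curvature point concrete, here is a rough sketch in plain Python. The test problem is my own choice for illustration: a two-dimensional quadratic f(x) = 0.5 * (1 * x0^2 + 100 * x1^2), so the curvature differs by a factor of 100 between the two axes. The function names, the fixed step size 1/L (L being the largest curvature), and the tolerance are all assumptions made for the example, not anything standard.

```python
def grad(x, curvatures):
    # Gradient of f(x) = 0.5 * sum(c_i * x_i**2), an assumed toy objective.
    return [c * xi for c, xi in zip(curvatures, x)]


def gradient_descent_iters(x, curvatures, tol=1e-6, max_iter=100_000):
    # Fixed step 1/L, where L is the largest curvature (a common safe choice).
    step = 1.0 / max(curvatures)
    for k in range(max_iter):
        g = grad(x, curvatures)
        if max(abs(gi) for gi in g) < tol:
            return k
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return max_iter


def newton_iters(x, curvatures, tol=1e-6, max_iter=100):
    # The Hessian here is diag(curvatures), so the Newton step just
    # divides each gradient component by the matching curvature.
    for k in range(max_iter):
        g = grad(x, curvatures)
        if max(abs(gi) for gi in g) < tol:
            return k
        x = [xi - gi / c for xi, gi, c in zip(x, g, curvatures)]
    return max_iter


curvatures = [1.0, 100.0]  # condition number 100
start = [1.0, 1.0]
gd_iters = gradient_descent_iters(start, curvatures)
nt_iters = newton_iters(start, curvatures)
print("gradient descent iterations:", gd_iters)
print("Newton iterations:", nt_iters)
```

On this problem gradient descent kills the steep direction immediately but only shrinks the shallow one by a factor of 0.99 per step, so it needs well over a thousand iterations, while Newton's method lands on the minimiser in a single step because the objective is exactly quadratic. On a real non-quadratic objective Newton would need more than one step and each step would cost a linear solve, which is part of why the trade-off looks different at large scale.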