by chestervonwinch
3252 days ago
In short, Newton's method uses second-order derivative information (the Hessian) to choose the search direction, while gradient descent uses only first-order information (the gradient). In between are "quasi-Newton" methods, which build an approximation to the Hessian from successive gradients and include generalizations of the secant method. I should also mention that there are all sorts of ad hoc approaches for trying to speed up the convergence of gradient descent, e.g., preconditioning, "momentum" terms, etc.
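To make the difference concrete, here is a rough sketch comparing the two updates on a small quadratic. The matrix `A`, vector `b`, step size, and tolerance are all made-up values for illustration, not anything from the discussion above. On a quadratic the Newton step lands on the minimizer in a single update, while fixed-step gradient descent needs many more.

```python
import numpy as np

# Hypothetical test problem: f(x) = 0.5 * x^T A x - b^T x,
# with A symmetric positive definite (values chosen for illustration).
A = np.array([[10.0, 0.0], [0.0, 1.0]])
b = np.array([1.0, 1.0])

def grad(x):
    # First-order information: gradient of f.
    return A @ x - b

def hess(x):
    # Second-order information: Hessian of f (constant for a quadratic).
    return A

def gradient_descent(x, step=0.1, tol=1e-8, max_iter=10_000):
    # Update: x <- x - step * grad(x)
    for k in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            return x, k
        x = x - step * g
    return x, max_iter

def newton(x, tol=1e-8, max_iter=100):
    # Update: x <- x - H(x)^{-1} grad(x), computed via a linear solve.
    for k in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            return x, k
        x = x - np.linalg.solve(hess(x), g)
    return x, max_iter

x0 = np.zeros(2)
x_gd, iters_gd = gradient_descent(x0)
x_nt, iters_nt = newton(x0)
print("gradient descent:", x_gd, "after", iters_gd, "updates")
print("Newton:          ", x_nt, "after", iters_nt, "updates")
```

The trade-off is that each Newton update requires forming and solving with the Hessian, which is what quasi-Newton methods try to avoid by approximating it.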