by jules 3252 days ago
The motivation for the Hessian is the same as for dividing by the second derivative in the one-dimensional case. Suppose we want to solve f(x) = 0. Taylor expand f(x) to first order around the current iterate a:

    f(x) =~ f(a) + f'(a)(x-a)
We want f(x) = 0, so:

    0 = f(a) + f'(a)(x-a)
Multiply both sides by the inverse of f'(a):

    0 = f'(a)^-1 f(a) + x-a
So:

    x = a - f'(a)^-1 f(a)
This is the update equation for Newton's method where a is the current iterate and x is the next iterate.
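
As a rough sketch of that update in one dimension (the name newton_root, the tolerance, and the iteration cap are my own choices for illustration, and it assumes f'(a) never hits zero along the way):

```python
def newton_root(f, fprime, a, tol=1e-12, max_iter=50):
    """Look for a root of f by iterating x = a - f(a) / f'(a)."""
    for _ in range(max_iter):
        x = a - f(a) / fprime(a)  # the update equation derived above
        if abs(x - a) < tol:      # stop once the iterate barely moves
            return x
        a = x
    return a  # best estimate if the tolerance was not reached


# Example: f(x) = x^2 - 2, whose positive root is sqrt(2).
root = newton_root(lambda x: x * x - 2.0, lambda x: 2.0 * x, a=1.0)
```

Starting from a = 1 this should settle on sqrt(2) within a handful of iterations. A poor starting point can make it diverge, which the sketch does not guard against.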

If f is a multidimensional function f : R^n -> R^n, then the derivative f'(a) is the Jacobian matrix, and the inversion becomes matrix inversion.
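
A minimal sketch of the same iteration for n = 2, assuming the Jacobian is invertible at every iterate. The helper names and the example system are made up for illustration. Rather than forming the inverse, it solves J d = f(a) for the step d (here with Cramer's rule) and sets x = a - d, which is the same update:

```python
def solve_2x2(J, b):
    """Solve J d = b for a 2x2 matrix J using Cramer's rule."""
    (j11, j12), (j21, j22) = J
    det = j11 * j22 - j12 * j21
    return ((b[0] * j22 - j12 * b[1]) / det,
            (j11 * b[1] - j21 * b[0]) / det)


def newton_system(f, jacobian, a, tol=1e-12, max_iter=50):
    """Iterate x = a - J(a)^-1 f(a) for f : R^2 -> R^2."""
    for _ in range(max_iter):
        d = solve_2x2(jacobian(a), f(a))  # d = J(a)^-1 f(a)
        x = (a[0] - d[0], a[1] - d[1])
        if max(abs(x[0] - a[0]), abs(x[1] - a[1])) < tol:
            return x
        a = x
    return a


# Example system: x^2 + y^2 = 4 and x = y, with a solution at (sqrt(2), sqrt(2)).
def f(p):
    return (p[0] ** 2 + p[1] ** 2 - 4.0, p[0] - p[1])


def jacobian(p):
    return ((2.0 * p[0], 2.0 * p[1]),
            (1.0, -1.0))


root = newton_system(f, jacobian, a=(1.0, 2.0))
```

For larger n you would normally hand the linear solve to a linear algebra library instead of writing it out by hand.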

When we use Newton's method for minimisation of a function g, we solve g'(x) = 0, so we pick f(x) = g'(x). Since the formula above contains f', we get the second derivative g''. The second derivative in multiple dimensions is the Hessian.
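
To make that concrete, here is a small sketch of Newton minimisation in two dimensions, where f is the gradient of g and f' is the Hessian. The test function g(x, y) = exp(x + y) + x^2 + y^2 and all the names are my own picks for illustration. It is strictly convex, so the Hessian is always invertible, which the code relies on:

```python
import math


def gradient(p):
    """Gradient of g(x, y) = exp(x + y) + x^2 + y^2."""
    e = math.exp(p[0] + p[1])
    return (e + 2.0 * p[0], e + 2.0 * p[1])


def hessian(p):
    """Hessian of g: the matrix of second partial derivatives."""
    e = math.exp(p[0] + p[1])
    return ((e + 2.0, e),
            (e, e + 2.0))


def newton_minimise(grad, hess, a, tol=1e-12, max_iter=50):
    """Solve grad(x) = 0 by iterating x = a - H(a)^-1 grad(a)."""
    for _ in range(max_iter):
        (h11, h12), (h21, h22) = hess(a)
        g1, g2 = grad(a)
        det = h11 * h22 - h12 * h21
        # Solve H d = grad with Cramer's rule instead of forming H^-1.
        d = ((g1 * h22 - h12 * g2) / det,
             (h11 * g2 - h21 * g1) / det)
        x = (a[0] - d[0], a[1] - d[1])
        if max(abs(d[0]), abs(d[1])) < tol:
            return x
        a = x
    return a


minimum = newton_minimise(gradient, hessian, a=(0.0, 0.0))
```

Note that this only finds a point where the gradient vanishes. On a non-convex g that could just as well be a maximum or a saddle point, so practical implementations add safeguards such as a line search.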