by JadeNB
3251 days ago
Since the author is reading, a few small typos, followed by one slightly more substantial comment: 'simgoid' should be 'sigmoid' (S-shaped); `x y = log(x) + log(y)` should be `log(x y) = log(x) + log(y)`; 'guarentee' should be 'guarantee'; 'recipricol' should be 'reciprocal'. I would like to see some mention of the fact that the division by the gradient is a meaningless, purely formal motivation for the correct step (inverting the Hessian) that follows.
If f is a multidimensional function f : R^n -> R^n, then the derivative f'(a) is the Jacobian, and inversion becomes matrix inversion.
When we use Newton's method to minimise a function g : R^n -> R, we solve g'(x) = 0, so we pick f(x) = g'(x), the gradient of g. Since the formula above contains f', we get a second derivative of g. In multiple dimensions the second derivative is the Hessian.
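
To make that concrete, here is a rough sketch in plain Python for n = 2. Everything in it is my own illustration rather than anything from the article: the name `newton_minimize`, the example function g(x, y) = exp(x) - x + y^2, and the choice to solve the 2x2 system H s = grad g by Cramer's rule instead of forming the inverse Hessian explicitly.

```python
import math


def newton_minimize(grad, hess, x, steps=20, tol=1e-12):
    """Newton's method for minimising g on R^2 (illustrative sketch).

    Each iteration solves H(x) s = grad(x) for the step s and sets x <- x - s,
    which is the matrix analogue of dividing by the second derivative.
    """
    for _ in range(steps):
        g1, g2 = grad(x)
        (h11, h12), (h21, h22) = hess(x)
        det = h11 * h22 - h12 * h21
        if det == 0:
            raise ZeroDivisionError("Hessian is singular at %r" % (x,))
        # Solve the 2x2 linear system H s = g by Cramer's rule.
        s1 = (g1 * h22 - h12 * g2) / det
        s2 = (h11 * g2 - h21 * g1) / det
        x = (x[0] - s1, x[1] - s2)
        if abs(s1) + abs(s2) < tol:
            break
    return x


# Hypothetical example: g(x, y) = exp(x) - x + y**2, minimised at (0, 0).
def grad(p):
    return (math.exp(p[0]) - 1.0, 2.0 * p[1])


def hess(p):
    return ((math.exp(p[0]), 0.0), (0.0, 2.0))


x_min, y_min = newton_minimize(grad, hess, (1.0, 1.0))
```

Note that nothing is ever divided by the gradient here: the gradient is the right-hand side of a linear system whose matrix is the Hessian. For larger n you would hand that system to a linear solver rather than write it out by hand.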