by seanharr11 3250 days ago
Thanks for the feedback, a co-worker just jabbed me regarding the log property mistake also...

As to the motivation for the correct step: can you point me to a resource that explains this? Not sure I follow...

2 comments

> As to the motivation for the correct step: can you point me to a resource that explains this? Not sure I follow...

You write an equation involving division by the gradient. This is an illegal operation (one cannot divide by a vector), and your final recipe doesn't do it. As far as I can tell, you are writing down the incorrect, illegally-vector-inverting formula as motivation for the correct formula involving the (inverse of the) Hessian. All I am suggesting is that you say explicitly something like "Of course, this formula as written is not literally correct; one cannot actually divide by a vector. The correct procedure is explained below."

(Incidentally, speaking of inverses, another poster (https://news.ycombinator.com/item?id=14881265) has mentioned that it may be a bit confusing to speak of the reciprocal of a matrix rather than the inverse, since (as I interpret that other poster's point) the reciprocal of a matrix is just its inverse. I might prefer to say something like "We write $H_{\ell(\theta)}^{-1}\nabla\ell(\theta)$ rather than $\frac{\nabla\ell(\theta)}{H_{\ell(\theta)}}$ to emphasise that we are inverting a matrix, not a scalar, so that the order of multiplication matters.")
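
To make the "invert the Hessian, don't divide by a vector" point concrete, here is a minimal sketch of one Newton update, assuming NumPy. The names newton_step, grad, and hess, and the toy quadratic objective, are made up for illustration and are not taken from the article. The step $H^{-1}\nabla\ell(\theta)$ is computed by solving a linear system rather than by forming an explicit inverse.

```python
import numpy as np

def newton_step(theta, grad, hess):
    """One Newton update: theta - H^{-1} g, with g the gradient and H the Hessian.

    The product H^{-1} g is obtained by solving H d = g for d, which avoids
    forming the inverse explicitly and makes clear that no vector is divided.
    """
    g = grad(theta)            # gradient, shape (n,)
    H = hess(theta)            # Hessian, shape (n, n)
    d = np.linalg.solve(H, g)  # d = H^{-1} g
    return theta - d

# Toy concave objective (an assumption for illustration only):
#   l(theta) = -0.5 * theta^T A theta + b^T theta
# with gradient b - A theta and constant Hessian -A.
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([1.0, -1.0])

grad = lambda theta: b - A @ theta
hess = lambda theta: -A

theta0 = np.zeros(2)
theta1 = newton_step(theta0, grad, hess)
```

Because this toy objective is exactly quadratic, a single step should land on the maximiser, where the gradient vanishes. For a real log-likelihood the step would be repeated until the gradient is close to zero.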

Ahh I got it. Understood, definitely worth clarifying, will update. Thanks.
Sorry to say it, but I got the impression that the author was unaware it was nonsensical, not that it was a clever motivation.
Bishop has a nice treatment of Newton's method in "Pattern Recognition and Machine Learning". Good book to have on your shelf if you are learning this stuff.