Hacker News new | ask | show | jobs
by xU1ppskunDmy6oz 3251 days ago
The author has far too little mathematical understanding to be teaching anybody (that’s my impression at least). If you don’t understand Newton’s method before reading this, you won’t understand it afterwards. “A method for finding the roots of a polynomial”. Why polynomials? Does it work, always? Is it fast? Why would following the tangent repeatedly be a good idea? “We take the inverse instead of the reciprocal because it’s a matrix”. Not impressed.
3 comments

I am a programmer trying to learn math, so are my intended audience members.

That said, I should include facts related to convergence, and maybe even speed compared to SGD.

As to the reciprocal -> inverse generalization, do you have any resources you could point we towards to better understand this?

Additionally, a concrete answer to "Why would following the tangent repeatedly be a good idea?" has been hard to come by for me. I am able to visualize this, but if you have resources that explain this well please share.

In general, it’s not a good idea. And in general, Newton’s method won’t converge.

Newton’s method boils down to replacing your function by a first-order approximation. For a differentiable function, in a small neighbourhood(!), that’s a good approximation (by definition), though, and the zero of the model function will be very close to the zero of the original function (if it lies in that neighbourhood).

PS: i did not expect the poster and author to be the same person, otherwise I would’ve phrased my criticism differently. A SHOW HN would have helped.

PPS: basically the whole reciprocal/inverse confusion only arises because you start the multidimensional case from your iteration formula. If you back to its derivation, and start again from there, you can avoid that.

> In general, it’s not a good idea. And in general, Newton’s method won’t converge.

Right, but this blog post isn't about the general case of using Newton's method to find roots, it's about using Newton's method for solving logistic regression for which it is perfectly suited, though there are better methods as well, of course.

Newton's method with a line search is the go-to algorithm for convex optimisation if the dimension of the problem is not too large.
To elaborate a bit: Newton’s method is amazing. But it’s also dangerous and difficult to use. You can talk about it for a long time and use it as the centrepiece of an course on vector analysis. This treatment does not explain anything - it might as well just say “we use the magical newton method, a discussion of which is beyond the scope of this work”. I’d find that honest.
Back when I was in high school Newton's Method was typically introduced as a method to find roots of a polynomial, usually in "Algebra 2" or Trigonometry or about then. That's because there is an easy to grasp formula for it that can be used in the context of a very familiar mathematical construct (polynomials) and produced from differentiation that doesn't technically require an understanding of calc to use (as opposed to actually understand, derive, etc.). I still use it pre-calc when I tutor younger students because it's a great exploratory tool.