Hacker News new | ask | show | jobs
by DoctorOetker 2778 days ago
The paper states for the main result:

>In this section, we show gradient descent with a constant positive step size converges to the global minimum with a linear rate.

This is rather ambiguous: it sounds like it guarantees "it WILL converge to the global minimum, btw at a linear rate" but I suspect they are really saying "IF it converges to the global minimum, THEN it will converge at a linear rate" in a way to purpousefully sound like the first statement.

could you comment on if GD does or does not find the global minimum of the integer factorization cost function in the following comment?

https://news.ycombinator.com/item?id=18439287

1 comments

Another sin they have is that they write "with high probability" in the theorem but they do not strictly define that in the main section of the paper. If you look into appendix you'll see that they guarantee the probability 1 - \delta and delta affects the convergence. It means that if you add many many parameters then you just need a couple attempts to converge well. So "IF it converges" is becoming much better "VERY OFTEN it converges". Sorry, no real passion to read the paper in details, so this is just an intuition from a quick look.