Hacker News new | ask | show | jobs
by fxtentacle 268 days ago
In this paper, each iteration has to be slower. Because they need to calculate both their new method (which may be faster) and also the traditional method (because they need a float gradient). And old+new will always be slower than just old.