Hacker News
by a1369209993 1349 days ago
> why can't we take a big step and be at the end in one step.

Because we're doing gradient descent. (No, seriously, it's turtles all the way down, or all the way up, considering we're at a higher level of abstraction here.)

We're trying to (quickly, in fewer than 100 steps) descend a gradient through a complex, irregular and heavily foggy 16384-dimensional landscape of smeared, distorted, and white-noise-covered images that kinda sorta look vaguely like what we want if you squint (well, if the neural network squints, anyway). If we try to take a big step, we don't descend the gradient faster; we fly off in a mostly random direction, clip through various proverbial cliffs, and probably end up somewhere higher up the landscape than we started.
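
To make the overshoot concrete, here's a toy sketch in plain Python. It is not the image model above: the landscape is just an assumed 3-dimensional bowl that is much steeper in some directions than others, and every name in it (`loss`, `grad`, `descend`, `CURVATURES`, `START`) is made up for illustration. The only point is that the gradient is local information, so scaling the step up stops helping once the step is larger than the region where the gradient is still a good guide.

```python
# Toy landscape (an assumption for illustration): a bowl whose steepness
# differs per axis, f(x) = 0.5 * sum(c_i * x_i^2).

def loss(x, curvatures):
    """Height of the landscape at point x."""
    return 0.5 * sum(c * xi * xi for c, xi in zip(curvatures, x))


def grad(x, curvatures):
    """Gradient of the loss at x (points uphill)."""
    return [c * xi for c, xi in zip(curvatures, x)]


def descend(x0, curvatures, step_size, steps):
    """Run plain gradient descent and return the final loss."""
    x = list(x0)
    for _ in range(steps):
        g = grad(x, curvatures)
        # Move against the gradient, scaled by the step size.
        x = [xi - step_size * gi for xi, gi in zip(x, g)]
    return loss(x, curvatures)


CURVATURES = [1.0, 10.0, 100.0]  # gentle, medium and steep directions
START = [1.0, 1.0, 1.0]

start_loss = loss(START, CURVATURES)
# Many small steps: each one stays where the local gradient is trustworthy.
small_steps_loss = descend(START, CURVATURES, step_size=0.01, steps=50)
# One big step: overshoots the steep directions and lands far up the other side.
one_big_step_loss = descend(START, CURVATURES, step_size=1.0, steps=1)

print(f"start:          {start_loss}")
print(f"50 small steps: {small_steps_loss}")
print(f"1 big step:     {one_big_step_loss}")
```

With these numbers the small steps end up well below the starting loss, while the single big step ends up far above it. And this bowl is the friendly case: it's smooth and noise-free, so there are no cliffs to clip through and no fog.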