by togelius 2942 days ago
I had to make a few simplifications to spell out the differences clearly and keep the text from getting infinitely long. It's true that most current gradient descent algorithms are stochastic, because the gradient is computed on mini-batches rather than the full dataset, and that sophisticated evolution strategies approximate the gradient. I still think the differences are significant, in that evolution updates less often and the direction of the update is much less (if at all) dependent on the feedback.
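
To make the "evolution strategies approximate the gradient" point concrete, here is a minimal sketch. It is not from the original discussion: the quadratic `loss`, the antithetic (mirrored) sampling scheme, and all names and parameter values (`sigma`, `pairs`, the example `w`) are assumptions chosen for illustration. It estimates a gradient purely from loss evaluations at randomly perturbed parameters and compares its direction with the analytic gradient.

```python
import math
import random


def loss(w, target):
    # Toy quadratic loss (an assumption for illustration): squared distance to target.
    return sum((wi - ti) ** 2 for wi, ti in zip(w, target))


def true_gradient(w, target):
    # Analytic gradient of the quadratic loss above.
    return [2.0 * (wi - ti) for wi, ti in zip(w, target)]


def es_gradient(w, target, sigma=0.1, pairs=2000, rng=None):
    # Evolution-strategy style estimate: only loss values are used, no derivatives.
    # Each Gaussian perturbation is evaluated in both directions (antithetic sampling).
    rng = rng or random.Random(0)
    grad = [0.0] * len(w)
    for _ in range(pairs):
        eps = [rng.gauss(0.0, 1.0) for _ in w]
        plus = loss([wi + sigma * e for wi, e in zip(w, eps)], target)
        minus = loss([wi - sigma * e for wi, e in zip(w, eps)], target)
        scale = (plus - minus) / (2.0 * sigma * pairs)
        for i, e in enumerate(eps):
            grad[i] += scale * e
    return grad


def cosine(a, b):
    # Cosine similarity: 1.0 means the two vectors point in the same direction.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


w = [1.0, -2.0, 0.5]
target = [0.0, 0.0, 0.0]
g_true = true_gradient(w, target)
g_es = es_gradient(w, target)
similarity = cosine(g_true, g_es)
print("cosine similarity between ES estimate and true gradient:", similarity)
```

With many perturbations the estimate should line up closely with the true gradient; with only a handful (try `pairs=2`) the direction becomes much noisier, which is one way to read the claim that the update direction depends less on the feedback.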

Now, your point about the extent to which this is really about neural networks is a good one. Could a network learn F=ma, even if we could not interpret it? Maybe, with the right data, represented the right way.