by oteytaud 2736 days ago
It's black-box optimization. This means we only have an objective function that we can evaluate, with no access to derivatives or any other information about it. So it is not relevant for training weights in deep learning for image classification, or for other problems where gradient-based training works well.
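To make "just an objective function" concrete, here is a minimal sketch of a derivative-free optimizer in plain Python. It is not Nevergrad's API; `sphere` and `one_plus_one` are hypothetical names chosen for illustration, and the method is a simple (1+1) evolution strategy with a one-fifth-style step-size rule. The optimizer only ever calls the objective and compares the returned values.

```python
import random


def sphere(x):
    """Toy objective: we may evaluate it, but we never ask for its gradient."""
    return sum((xi - 0.5) ** 2 for xi in x)


def one_plus_one(objective, dim, budget, sigma=0.3, seed=0):
    """Hypothetical helper: a minimal (1+1) evolution strategy.

    Keeps one parent, samples one Gaussian-perturbed child per step,
    and keeps the child only if it is at least as good as the parent.
    """
    rng = random.Random(seed)
    best = [0.0] * dim
    best_loss = objective(best)
    for _ in range(budget):
        candidate = [xi + rng.gauss(0.0, sigma) for xi in best]
        loss = objective(candidate)
        if loss <= best_loss:
            # Success: accept the child and widen the search step.
            best, best_loss = candidate, loss
            sigma *= 1.5
        else:
            # Failure: shrink the step a little (one-fifth-style rule).
            sigma *= 1.5 ** -0.25
    return best, best_loss


best, best_loss = one_plus_one(sphere, dim=2, budget=500)
print(best, best_loss)
```

The only thing the loop needs from the problem is a number per candidate, which is why this style of method still applies when the objective is a simulator score or anything else without usable gradients.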

There was a recent paper from Uber showing that GA works well for training weights, so I wouldn't dismiss that area right away.
Sure, GA can be great for weights as well, but mainly when the gradient is unreliable. For example, I would not use Nevergrad to train the weights of a convolutional network for image classification, whereas I do use Nevergrad for WorldModels.
Doesn't the model Uber used begin with a stack of convolutional layers, since it processes raw images?
What’s GA here?
GA stands for genetic algorithms.
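For anyone who hasn't seen one, a rough sketch of a genetic algorithm in plain Python follows. It is a toy, not the method from the Uber paper and not Nevergrad code: the OneMax fitness (count the 1 bits), the function names, and all parameter values are assumptions picked for illustration. It shows the usual ingredients: a population, selection, crossover, mutation, and elitism.

```python
import random


def fitness(bits):
    """Toy fitness (OneMax): number of 1 bits; higher is better."""
    return sum(bits)


def genetic_algorithm(n_bits=20, pop_size=30, generations=60,
                      mutation_rate=0.05, seed=0):
    """Hypothetical helper: a minimal generational GA on bit strings."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Elitism: carry the current best individual over unchanged.
        elite = max(population, key=fitness)
        next_gen = [elite[:]]
        while len(next_gen) < pop_size:
            # Tournament selection: best of 3 random individuals, twice.
            p1 = max(rng.sample(population, 3), key=fitness)
            p2 = max(rng.sample(population, 3), key=fitness)
            # One-point crossover.
            cut = rng.randint(1, n_bits - 1)
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation, applied independently to each bit.
            child = [1 - b if rng.random() < mutation_rate else b
                     for b in child]
            next_gen.append(child)
        population = next_gen
    return max(population, key=fitness)


best_bits = genetic_algorithm()
print(best_bits, fitness(best_bits))
```

Nothing here uses a gradient: the fitness is only evaluated and compared, which is what makes GA usable when gradients are unreliable or unavailable.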