|
|
|
|
|
by yorwba
3055 days ago
|
|
Google simply has the computational resources to cover thousands of different hyperparameter combinations. If you don't have that, you won't ever be able to do systematic exploration, so you might as well rely on intuition and luck. |
|
But how does it work? It's enough to outpace other implementations, alright. But the model even works on a consumer machine, if I remember correctly.
I have only read a few abstract descriptions and I have no idea about deep learning specifically. So the following is more musing than summary:
They use the Monte Carlo method to generate a sparse search space. The data structure is likely highly optimized to begin with. And it's no just a single network (if you will, any abstract syntax tree is a network, but that's not the point), but a whole architecture of networks --modules from different lines of research pieced together, each probably with different settings. I would be surprised if that works completely unsupervised; after all it took months from beating go to chess. They can run it without training the weights, but likely because the parameters and layouts are optimized already, and to the point of the OP, because some optimization is automatic. I guess what I'm trying to say is, if they extracted features from their own thought process (ie. domain knowledge) and mirrored that in code, than we are back at expert systems.
PS: Instead of letting processors run small networks, take advantage of the huge neural network experts have in their head and guide the artificial neural network into the right direction. Mostly, information processing follows insight from other fields, and doesn't deliver explanations. The explanations have to be there already. It would be particularly interesting to hear how the chess play of the developers involved has evolved since and how much they actually do understand the model.