by espadrine 3313 days ago
> people could generate arbitrarily many self-play games

I have doubts. Their TPU design may be a large factor in playing matches at this level within the time limits. And at this point, some implementation details might hook into Google-specific libraries that require the ability to spawn processes on thousands of servers, which past blog posts[0] have hinted at.

[0]: https://deepmind.com/blog/decoupled-neural-networks-using-sy...

2 comments

There might be some hard-to-release infrastructure code for the MCTS part, certainly, but the model on its own should be a standard TF CNN model and highly competitive (and people can write their own MCTS wrapper; it's not that complex an algorithm). Nothing in the AG paper or statements since has hinted at using anything as exotic as synthetic gradients*, and there is no reason to use synthetic gradients in AG. (In RL applications the NNs are generally small because there's so little supervision from the rewards that a large NN would overfit grossly; a NN so large as to require synthetic gradients to be split across GPUs would be simply catastrophically bad. Plus, the input of a 19x19 board, a few planes of metadata, and other details encapsulating the state is small compared to many applications like image labeling, further reducing the benefits of size. Silver has said AG is now 40 layers, but that's not much compared to the 1000-layer ResNet monsters, and even those 40 layers are probably going to be thin layers, since it's depth, not width, that provides the equivalent of more serial computation, making for a model with relatively few parameters overall.)

* I find synthetic gradients super cool and I've been reading DM papers closely for hints of their use anywhere, and have been disappointed that the idea doesn't appear to be going anywhere. The only followup so far has been https://arxiv.org/abs/1703.00522 which is more of a dissection and further explanation of the original paper than an extension or application.
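
To make the "write your own MCTS wrapper" point concrete, here is a minimal sketch of a PUCT-style search loop in Python. Everything in it is an assumption for illustration, not anything from the AG paper or DeepMind's code: the game is a toy take-1-to-3-stones Nim rather than Go, `evaluate` is a hypothetical stand-in for a trained policy/value net (it just returns uniform priors and a neutral value), and the names `Node`, `run_mcts`, `best_move`, and the constant `c_puct=1.5` are made up.

```python
import math


class Node:
    """One tree node; statistics are from the view of the player who moved into it."""

    def __init__(self, prior):
        self.prior = prior      # prior probability from the (stub) policy
        self.visits = 0
        self.value_sum = 0.0
        self.children = {}      # move -> Node

    def q(self):
        return self.value_sum / self.visits if self.visits else 0.0


def legal_moves(pile):
    # Toy game (assumption): take 1, 2 or 3 stones; taking the last stone wins.
    return [m for m in (1, 2, 3) if m <= pile]


def evaluate(pile):
    """Hypothetical stand-in for a network: uniform priors, neutral value."""
    moves = legal_moves(pile)
    return {m: 1.0 / len(moves) for m in moves}, 0.0


def run_mcts(root_pile, n_simulations=200, c_puct=1.5):
    """Run the search from a non-empty pile and return visit counts per root move."""
    root = Node(prior=1.0)
    for _ in range(n_simulations):
        node, pile, path = root, root_pile, [root]
        # Selection: descend by mean value plus a prior-weighted exploration bonus.
        while node.children:
            parent = node
            move, node = max(
                parent.children.items(),
                key=lambda kv: kv[1].q()
                + c_puct * kv[1].prior * math.sqrt(parent.visits) / (1 + kv[1].visits),
            )
            pile -= move
            path.append(node)
        if pile == 0:
            value = 1.0  # the player who just moved took the last stone and wins
        else:
            # Expansion: ask the stub evaluator for priors and a value estimate.
            priors, v = evaluate(pile)
            for m, p in priors.items():
                node.children[m] = Node(p)
            value = -v  # evaluate() scores the player to move; flip for the mover
        # Backup: alternate the sign at each level going back to the root.
        for n in reversed(path):
            n.visits += 1
            n.value_sum += value
            value = -value
    return {m: c.visits for m, c in root.children.items()}


def best_move(pile, n_simulations=200):
    visits = run_mcts(pile, n_simulations)
    return max(visits, key=visits.get)
```

Swapping the stub `evaluate` for a real net's policy and value outputs, and the Nim rules for a Go engine, is the part that takes actual work; the search loop itself stays about this size. The hard-to-release part would be running it distributed across many machines, which this single-threaded sketch does not attempt.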

They could just release the trained nets and let us re-scale the code. Even without a large MCTS it is still powerful.