


	by gwern 3314 days ago
	> At this point, it is hard to complain.

	Actually, it's very easy to complain. If they released the model, people could generate arbitrarily many self-play games instead of depending on DM to release 50, could create arbitrarily many tools using the model instead of depending on DM to create and maintain a single tool, and could verify the results of training a clone based on even sketchy descriptions of the methods instead of depending on DM releasing a detailed enough whitepaper and then guessing at whether a reimplementation is competitive or not. DM is only being 'generous' if you ignore how releasing the model is easier for them and superior for us in every way.

2 comments

espadrine 3314 days ago

> people could generate arbitrarily many self-play games

I have doubts. Their TPU design may be a large factor in making matches at this level possible within the time limits. And at this point, some implementation details might hook into Google-specific libraries that require the ability to spawn processes on thousands of servers, which past blog posts[0] have hinted at.

[0]: https://deepmind.com/blog/decoupled-neural-networks-using-sy...

gwern 3314 days ago

There might be some hard-to-release infrastructure code for the MCTS part, certainly, but the model on its own should be a standard TF CNN model and highly competitive (and people can write their own MCTS wrapper, it's not that complex an algorithm). Nothing in the AG paper or statements since has hinted at using anything as exotic as synthetic gradients* and there is no reason to use synthetic gradients in AG. (In RL applications the NNs are generally small because there's so little supervision from the rewards that a large NN would overfit grossly; a NN so large as to require synthetic gradients to be split across GPUs would be simply catastrophically bad. Plus, the input of a 19x19 board, a few planes of metadata, and other details encapsulating the state is small compared to many applications like image labeling, further reducing the benefits of size. Silver has said AG is now 40 layers but that's not much compared to the 1000-layer ResNet monsters, and even those 40 layers are probably going to be thin layers, since it's the depth which provides more serial computation equivalence, not width, making for a model with relatively few parameters overall.)
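To put rough numbers on the "thin layers" point, here is a back-of-the-envelope sketch. Everything in it is an assumption for illustration, not AG's actual architecture: plain 3x3 convolutions, a hypothetical 17 input planes, and made-up widths of 64 and 256 filters. It only shows that parameter count grows linearly with depth but quadratically with width.

```python
def conv_stack_params(depth, width, in_planes, kernel=3):
    """Weights plus biases for `depth` stacked kernel x kernel conv layers.

    Hypothetical helper: assumes every layer has `width` filters and that
    the first layer reads `in_planes` input feature planes.
    """
    first = kernel * kernel * in_planes * width + width
    rest = (depth - 1) * (kernel * kernel * width * width + width)
    return first + rest


# Illustrative numbers only: 40 layers, 17 input planes (both assumed here).
thin = conv_stack_params(depth=40, width=64, in_planes=17)
wide = conv_stack_params(depth=40, width=256, in_planes=17)
print(thin, wide, wide / thin)
```

Under these assumptions the thin stack comes to roughly 1.5M parameters and the wide one to roughly 23M, so quadrupling the width costs about 16x the parameters at the same depth.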

* I find synthetic gradients super cool and I've been reading DM papers closely for hints of their use anywhere, and have been disappointed that the idea doesn't appear to be going anywhere. The only follow-up so far has been https://arxiv.org/abs/1703.00522 which is more of a dissection and further explanation of the original paper than an extension or application.
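As an illustration of how small an MCTS wrapper around a model can be, here is a minimal sketch. It is not DM's implementation: the game is a toy (take 1 to 3 stones, whoever takes the last stone wins), `dummy_model` is a hypothetical stand-in for the trained network that returns uniform priors and a neutral value, and the selection rule is a simplified PUCT-style score with an assumed constant `c_puct`.

```python
import math


class Node:
    def __init__(self, prior):
        self.prior = prior      # prior probability from the model
        self.visits = 0         # number of simulations through this node
        self.value_sum = 0.0    # summed values, from the view of the player who moved here
        self.children = {}      # move -> Node

    def q(self):
        return self.value_sum / self.visits if self.visits else 0.0


def legal_moves(stones):
    return [m for m in (1, 2, 3) if m <= stones]


def dummy_model(stones):
    """Hypothetical stand-in for the network: uniform priors, value 0."""
    moves = legal_moves(stones)
    return {m: 1.0 / len(moves) for m in moves}, 0.0


def select_child(node, c_puct=1.5):
    """Pick the child maximizing Q + U (simplified PUCT-style score)."""
    total = sum(child.visits for child in node.children.values())

    def score(item):
        _, child = item
        u = c_puct * child.prior * math.sqrt(total + 1) / (1 + child.visits)
        return child.q() + u

    return max(node.children.items(), key=score)


def simulate(node, stones, model):
    """One simulation; returns the value for the player to move at `stones`."""
    if stones == 0:
        return -1.0  # the previous player took the last stone and won
    if not node.children:
        priors, value = model(stones)  # expand the leaf using the model
        for move, p in priors.items():
            node.children[move] = Node(p)
        return value
    move, child = select_child(node)
    value = -simulate(child, stones - move, model)  # flip the opponent's value
    child.visits += 1
    child.value_sum += value
    return value


def best_move(stones, model=dummy_model, simulations=400):
    root = Node(prior=1.0)
    for _ in range(simulations):
        simulate(root, stones, model)
    # Play the most-visited move at the root.
    return max(root.children.items(), key=lambda kv: kv[1].visits)[0]


print(best_move(5))
```

Even with a model that knows nothing, the search should settle on taking 1 stone from a pile of 5, leaving the opponent a multiple of 4. Swapping `dummy_model` for a real policy/value network is the only change the wrapper itself would need; the hard part is the model and the scale, not the tree search.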

visarga 3313 days ago

They could just release the trained nets and let us re-scale the code. Even without a large MCTS it is still powerful.

rojobuffalo 3313 days ago

DeepMind's mission is to build AGI. I think it's probably good if they have a buffered lead on all other efforts. That concern probably weighs on decisions about releasing code.

The rationale for why a buffer would be good is described by Demis Hassabis here: https://youtu.be/h0962biiZa4?t=11m24s

...the main points are: there may be safety considerations along the way that are costly. More "capitalistic" organizations may decide to shortcut those costs because of the winner-take-all scenario. DeepMind is at least nominally very committed to safety.

Releasing AlphaGo's source code would probably reduce DeepMind's buffer, which, in theory, would also reduce safety.

gwern 3313 days ago

That would require some radically inconsistent thinking on their part. DM does occasionally release source code and trained models for other things, and the arms race logic (https://www.fhi.ox.ac.uk/wp-content/uploads/Racing-to-the-pr...) would even more strongly argue for not releasing anything, even research (they're privately owned, they don't have to publish squat), and especially not running stunts like the AlphaGo tournament which cost millions of dollars in order to terrify and impress competitors and heat up the arms race.

A more parsimonious explanation is simply that it's great PR to maintain rigid control over the family jewels and dribble out occasional sample games and bits and pieces while pretending to be generous. (No one has ever accused Hassabis of being bad at PR or not knowing how to milk the media.)

johnsmith21006 3310 days ago

I am constantly amazed at what Google shares. To be fair, they are a company with shareholders.
