Hacker News
by ipsum2 3313 days ago
Congratulations to DeepMind and Google on this tremendous achievement.

However, it is disappointing that the code and model will not be released publicly after AlphaGo finishes competitive play. It's one thing to say that an apple, once dropped, will fall to the ground, but another to describe its motion as (1/2)at^2 + vt.

4 comments

They did announce that they would release a teaching tool which will show AlphaGo's analysis of Go positions, as well as the paper explaining how to build your own.

Not only do you have the principle and the formula behind it, but also a little physics simulator tool! At this point, it is hard to complain.

> At this point, it is hard to complain.

Actually, it's very easy to complain. If they released the model, people could generate arbitrarily many self-play games instead of depending on DM to release 50, could create arbitrarily many tools using the model instead of depending on DM to create and maintain a single tool, and could verify the results of training a clone based on even sketchy descriptions of the methods instead of depending on DM releasing a detailed enough whitepaper and then guessing at whether a reimplementation is competitive or not. DM is only being 'generous' if you ignore how releasing the model is easier for them and superior for us in every way.

> people could generate arbitrarily many self-play games

I have doubts. Their TPU design may be a large factor in playing matches at this level within the time limits. And at this point, some implementation details might hook into Google-specific libraries that require the ability to spawn processes on thousands of servers, which past blog posts[0] have hinted at.

[0]: https://deepmind.com/blog/decoupled-neural-networks-using-sy...

There might be some hard-to-release infrastructure code for the MCTS part, certainly, but the model on its own should be a standard TF CNN model and highly competitive (and people can write their own MCTS wrapper; it's not that complex an algorithm). Nothing in the AG paper or statements since has hinted at using anything as exotic as synthetic gradients*, and there is no reason to use synthetic gradients in AG. (In RL applications the NNs are generally small because there's so little supervision from the rewards that a large NN would overfit grossly; a NN so large as to require synthetic gradients to be split across GPUs would be simply catastrophically bad. Plus, the input of a 19x19 board, a few planes of metadata, and other details encapsulating the state is small compared to many applications like image labeling, further reducing the benefits of size. Silver has said AG is now 40 layers, but that's not much compared to the 1000-layer ResNet monsters, and even those 40 layers are probably going to be thin layers, since it's the depth which provides more serial computation equivalence, not width, making for a model with relatively few parameters overall.)

* I find synthetic gradients super cool and I've been reading DM papers closely for hints of their use anywhere and have been disappointed that the idea doesn't appear to be going anywhere. The only follow-up so far has been https://arxiv.org/abs/1703.00522 which is more of a dissection and further explanation of the original paper than an extension or application.
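
To make the "not that complex an algorithm" point above concrete, here is a minimal sketch of plain UCT-style MCTS with random rollouts. It is not AlphaGo's search, which (per the paper) guides the tree with policy and value networks; the toy game (a single Nim pile, take 1 to 3 stones, whoever takes the last stone wins), the `Node` class, and the names `uct_child`, `rollout`, and `mcts` are all assumptions made up for illustration.

```python
import math
import random


class Node:
    """One position in the search tree (hypothetical toy game: single-pile Nim)."""

    def __init__(self, stones, parent=None, move=None):
        self.stones = stones      # stones left for the player to move
        self.parent = parent
        self.move = move          # move that led from parent to this node
        self.children = []
        self.untried = [m for m in (1, 2, 3) if m <= stones]
        self.visits = 0
        self.wins = 0.0           # wins for the player who moved INTO this node


def uct_child(node, c=1.4):
    """Pick the child maximizing win rate plus an exploration bonus (UCB1)."""
    return max(
        node.children,
        key=lambda ch: ch.wins / ch.visits
        + c * math.sqrt(math.log(node.visits) / ch.visits),
    )


def rollout(stones, rng):
    """Play uniformly random moves; True if the player to move here ends up winning."""
    player = 0  # 0 = the player to move at the start of the rollout
    while stones > 0:
        stones -= rng.randint(1, min(3, stones))
        if stones == 0:
            return player == 0
        player ^= 1
    return False  # no stones left: the previous player already took the last one


def mcts(root_stones, iterations=2000, seed=0):
    """Return the most-visited move from the root after a fixed number of iterations."""
    rng = random.Random(seed)
    root = Node(root_stones)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend while the node is fully expanded and not terminal.
        while not node.untried and node.children:
            node = uct_child(node)
        # 2. Expansion: add one untried move as a new child.
        if node.untried:
            move = node.untried.pop()
            child = Node(node.stones - move, parent=node, move=move)
            node.children.append(child)
            node = child
        # 3. Simulation: random playout from the new position.
        mover_wins = rollout(node.stones, rng)
        # 4. Backpropagation: flip the result at each level, since players alternate.
        result = 0.0 if mover_wins else 1.0
        while node is not None:
            node.visits += 1
            node.wins += result
            result = 1.0 - result
            node = node.parent
    return max(root.children, key=lambda ch: ch.visits).move
```

In this toy game the winning strategy is to leave the opponent a multiple of 4 stones, so from 5 stones the search should settle on taking 1. A Go version would swap the random rollout for a network evaluation and need real board logic, which is where most of the engineering effort would go.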

They could just release the trained nets and let us re-scale the code. Even without a large MCTS it is still powerful.
DeepMind's mission is to build AGI. I think it's probably good if they have a buffered lead on all other efforts. That concern probably weighs on decisions about releasing code.

The rationale for why a buffer would be good is described by Demis Hassabis here: https://youtu.be/h0962biiZa4?t=11m24s

...the main points are: there may be safety considerations along the way that are costly. More "capitalistic" organizations may decide to shortcut those costs because of the winner-take-all scenario. DeepMind is at least nominally very committed to safety.

Releasing AlphaGo's source code would probably reduce DeepMind's buffer, which, in theory, would also reduce safety.

That would require some radically inconsistent thinking on their part. DM does occasionally release source code and trained models for other things, and the arms race logic (https://www.fhi.ox.ac.uk/wp-content/uploads/Racing-to-the-pr...) would even more strongly argue for not releasing anything, even research (they're privately owned, they don't have to publish squat), and especially not running stunts like the AlphaGo tournament which cost millions of dollars in order to terrify and impress competitors and heat up the arms race.

A more parsimonious explanation is simply that it's great PR to maintain rigid control over the family jewels and dribble out occasional sample games and bits and pieces while pretending to be generous. (No one has ever accused Hassabis of being bad at PR or not knowing how to milk the media.)

I am constantly amazed at what Google shares. They are a company with shareholders, to be fair.
"We plan to publish one final academic paper later this year that will detail the extensive set of improvements we made to the algorithms’ efficiency and potential to be generalised across a broader set of problems."

Should be enough, no?

Depends on what you're looking for. Most ML papers do not disclose the weights/models they used, or all the details needed to build a fully reproducible solution. It doesn't seem like that will change this time.
There are other Go playing programs and they've apparently improved a lot by applying ideas from the original AlphaGo paper. It seems reasonable to assume they will improve more based on ideas from the new paper too, and probably surpass AlphaGo before too long.

(Similarly, Deep Blue was dismantled but chess engines continued to evolve.)

Though if they released the code, Tencent would incorporate it into their rival, so I can see the argument for delaying a bit.
If it's retired from competitive play, it no longer has a rival.
They're also publishing later in the year with all the details.
Publications never have "all the details".