| > AlphaGo essentially baked good moves into value and policy network by playing millions of times. I don't think that's a very good description of how AlphaGo was trained at all. You're essentially saying it merely overfits the training set, yet it clearly generalizes rather well to unseen board situations and still evaluates them successfully. No machine learning system would be considered useful if all it could do were memorize the training data. As for the use of deep reinforcement learning: for one, the role of reinforcement learning in the first version of AlphaGo, the one described in the Nature paper, was rather limited and made up only a small part of its training. It just turned a ~3d KGS policy network into a ~5d KGS bot, which was then used to generate a training set for the value net. If we had enough recorded human games to train the value net directly, that would be an unnecessary step anyhow. And you could create such a training set without reinforcement learning, since there are pure Monte Carlo bots stronger than 5d KGS, but that would be far more computationally expensive. But it's still not really true that there aren't obvious applications of deep reinforcement learning. Indeed, robotics is one promising application, and that seems rather relevant. This paper initially demonstrated an impressive improvement in manipulation tasks, and you can probably follow its numerous citations for newer work: http://arxiv.org/abs/1504.00702 I do agree that the exact architecture in AlphaGo probably doesn't have applications beyond teaching us how to play Go better; it seems too specialized. I believe they mean it in just the vaguest possible sense: that the kind of deep algorithms demonstrating incredible performance in AlphaGo have diverse applications. But this should not come as a surprise to anyone even loosely following what people have done with deep learning in the past couple of years. |
Go has deep strategy, but it is very well defined in terms of what can and cannot be done, and those rules are not particularly complex. Power grids, in contrast, are far more complex. There are thousands of rules, but also many more thousands of unwritten assumptions and case-by-case analyses. A final issue is that there are problems that remain unsolved, and others that have not even been recognized.
The last AI winter (deep learning is just the latest rebrand) came from researchers overstating their accomplishments and making promises about general intelligence that could not be kept. Any claim that something requiring general intelligence will arrive in the near future is undoubtedly overpromising.