
Can you define what you mean by “theoretical insight”? It’s true that AlphaGo was built from existing techniques (supervised learning, a large training dataset, reinforcement learning, and Monte Carlo tree search). But if you don’t consider something a breakthrough unless it literally introduces a novel fundamental technique, you have a very narrow view of research (in my opinion).

Here are a few points to consider:

1. The combination of these techniques in AlphaGo was non-standard. Supervised learning on human games bootstrapped the reinforcement learning stage, whose self-play games were then used to train a value function that was passed to the Monte Carlo tree search.

2. AlphaGo represents a new achievement in perfect-information games. The research team has since moved on to StarCraft, which is not a perfect-information game, but they didn’t try to tackle that before first conquering a complex perfect-information game.

3. AlphaGo’s research team improved upon the original AlphaGo with a novel algorithm that masters the game purely through self-play, using tree search as its policy improvement step. The new AlphaGo Zero uses no human training data or supervised learning, and it defeated the original AlphaGo 100-0.
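To make the last step in point 1 concrete, here is a rough sketch of what "passing a value function to the tree search" can look like. It is not AlphaGo's actual search (which also uses a policy network to guide move selection); it is a plain UCT-style Monte Carlo tree search on a toy Nim game, where leaf positions are scored by a value function instead of by random rollouts. The game, the `value_fn` stand-in, and all other names here are assumptions made up for illustration.

```python
import math

MOVES = (1, 2, 3)  # toy Nim: remove 1-3 stones; whoever takes the last stone wins


def value_fn(stones):
    """Hypothetical stand-in for a learned value network.

    Returns the estimated outcome in [-1, 1] for the player to move.
    (Here it is a hand-written rule; in AlphaGo it would be a trained network.)
    """
    return -1.0 if stones % 4 == 0 else 1.0


class Node:
    def __init__(self, stones):
        self.stones = stones
        self.children = {}  # move -> Node
        self.visits = 0
        self.total = 0.0    # summed values, from the view of the player to move here


def simulate(node, value_fn, c=1.4):
    """Run one simulation; return its value for the player to move at `node`."""
    if node.stones == 0:
        # The previous player took the last stone, so the player to move has lost.
        value = -1.0
    elif node.visits == 0:
        # New leaf: ask the value function instead of playing a random rollout.
        value = value_fn(node.stones)
    else:
        if not node.children:
            node.children = {
                m: Node(node.stones - m) for m in MOVES if m <= node.stones
            }

        def ucb(child):
            if child.visits == 0:
                return math.inf
            # child.total is from the opponent's view, so negate the mean.
            mean = -child.total / child.visits
            return mean + c * math.sqrt(math.log(node.visits) / child.visits)

        child = max(node.children.values(), key=ucb)
        value = -simulate(child, value_fn, c)

    node.visits += 1
    node.total += value
    return value


def best_move(stones, value_fn, simulations=200):
    """Search from a non-terminal position and return the most-visited move."""
    root = Node(stones)
    for _ in range(simulations):
        simulate(root, value_fn)
    return max(root.children, key=lambda m: root.children[m].visits)


if __name__ == "__main__":
    print(best_move(5, value_fn))  # prints 1
```

From five stones the search settles on taking one stone, which leaves the opponent with four (a lost position in this toy game). The point of the sketch is only the plumbing: the search never plays a game to the end unless it actually reaches a terminal position; everywhere else it trusts whatever `value_fn` says.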

Beyond self-play, I think that AlphaGo’s methodologies can generalize to combinatorial search problems even if they don’t generalize to broader domains like partially observed games or robotics.