by JustFinishedBSG
3124 days ago
> I think you are significantly underestimating the theoretical innovations contributed to the field every time these models are substantially improved.

I think you are overestimating them; there isn't a single interesting theoretical insight in AlphaGo's papers.
Here are a few points to consider:
1. The combination of the aforementioned techniques in AlphaGo was non-standard. Supervised learning on human expert games bootstrapped the policy network, reinforcement learning through self-play refined it, and the resulting value function was then passed to the Monte Carlo tree search.
2. AlphaGo represents a new achievement in mastering perfect-information games. The research team has moved on to StarCraft, which is not a perfect-information game, but they didn’t try to tackle it until they had conquered a complex perfect-information game.
3. AlphaGo’s research team improved upon the original AlphaGo with a novel self-play algorithm in which the tree search itself acts as the policy improvement step. The new AlphaGo Zero does not use human training data or supervised learning, and it defeated the previously published AlphaGo 100-0.
Beyond self-play, I think that AlphaGo’s methodologies can generalize to combinatorial search problems even if they don’t generalize to broader domains like partially observed games or robotics.
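
To make point 1 a bit more concrete, here is a rough sketch of what "passing a value function to the tree search" can look like. This is not AlphaGo's code. It is a toy on single-pile Nim (my choice of game), with a uniform prior standing in for the policy network and a hand-written `value_fn` standing in for the learned value network. All names and the `c_puct` constant are assumptions for illustration. The point is only that leaves are scored by the value function instead of by random rollouts.

```python
import math

# Toy game (assumption): single-pile Nim. Players alternately take 1-3 stones;
# whoever takes the last stone wins.
def legal_moves(stones):
    return [m for m in (1, 2, 3) if m <= stones]

def value_fn(stones):
    # Hypothetical stand-in for a learned value network. Returns the value
    # for the player to move, in [-1, 1]. Here it is the known exact answer
    # for Nim; a real network would only approximate this.
    return -1.0 if stones % 4 == 0 else 1.0

class Node:
    def __init__(self, stones, prior):
        self.stones = stones
        self.prior = prior      # stand-in for the policy network's output
        self.visits = 0
        self.value_sum = 0.0    # from the view of the player to move here
        self.children = {}      # move -> Node

    def q(self):
        return self.value_sum / self.visits if self.visits else 0.0

def select_child(node, c_puct=1.5):
    # PUCT-style selection: exploit the mean value, explore by prior/visits.
    # A child's value is from the opponent's view, so it is negated.
    def score(item):
        _, child = item
        u = c_puct * child.prior * math.sqrt(node.visits) / (1 + child.visits)
        return -child.q() + u
    return max(node.children.items(), key=score)

def simulate(node):
    # Runs one simulation and returns the value for the player to move at node.
    if node.stones == 0:
        v = -1.0  # the opponent took the last stone, so the mover has lost
    elif not node.children:
        # Leaf: expand it and score it with the value function (no rollout).
        moves = legal_moves(node.stones)
        for m in moves:
            node.children[m] = Node(node.stones - m, 1.0 / len(moves))
        v = value_fn(node.stones)
    else:
        _, child = select_child(node)
        v = -simulate(child)
    node.visits += 1
    node.value_sum += v
    return v

def best_move(stones, n_sims=200):
    root = Node(stones, prior=1.0)
    for _ in range(n_sims):
        simulate(root)
    # Play the most-visited move at the root.
    return max(root.children.items(), key=lambda kv: kv[1].visits)[0]
```

Calling `best_move(10)` runs 200 simulations from a pile of 10 and returns the most-visited move. Swapping `value_fn` for a noisier estimate is the interesting experiment, since the search then has to correct the evaluator's mistakes.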