by gwern
2394 days ago
He's also wrong about a lot. For all his insistence on accuracy, he himself is misleading or ignorant. Like his slam of the Dartmouth project - even if you knew nothing about it, all you have to do is click through to see the claims of 'solving vision' are sheer projection and urban legend. And he's happy to make up claims out of whole cloth: for example, when he says "AlphaGo works fine on a 19x19 board, but would need to be retrained to play on a rectangular board; the lack of transfer is telling.", for which he provides precisely zero evidence, he ignores the fact that AG training works fine on a mixture of board sizes and such progressive growing/curriculum training in fact seems to accelerate training (https://arxiv.org/abs/1902.10565) and that rectangular convolutions are a thing that exist, and there would be plenty of transfer if anyone tried. If no one has tried that exact thing, it's because it would be pointless and it's obvious to everyone not named 'Marcus' that it'd work fine; being rectangular doesn't mystically stop it from working any more than using a 13x13 rather than 19x19 board makes AG-style training stop working. (This is not the first time Marcus has claimed that something didn't work; I first realized that he doesn't actually keep up with the AI literature when I pointed out to him on Twitter that plenty of knowledge graphs were in use combined with DL, and Google Search was the biggest example of this, and he had no idea what I was talking about.) No matter what DL or DRL does, or how little his own preferred paradigm does, Marcus will never ever admit anything. AlphaGo beats humans? Well, it just copied humans. AlphaZero learns from scratch? Well, he wrote a whole paper explaining how akshully it still copies humans because the tree search encodes the rules. MuZero throws out even the tree search's knowledge of rules? Crickets and essays about 'misinformation'.
Regarding the lack of transfer: yes, AlphaGo, AlphaZero and most of their variants have a fixed board size and shape hard-coded in their architecture (just as they have the types of piece moves hard-coded) and need architectural modifications and retraining before they can play on different boards or with different pieces (e.g. AlphaGo can't play chess or shogi unmodified). The KataGo paper (the paper you linked) is one exception to this. Personally, I don't know of any others. Anyway, general game-playing is a hard task and nobody claims that AlphaGo has solved it.
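To make the "hard-coded board size" point concrete, here is a toy sketch in plain Python. It is not AlphaGo's or KataGo's actual network; every name in it (`conv2d_same`, `global_mean_pool`, `dense_head`, `KERNEL`, `make_board`) is hypothetical and chosen only for illustration. It shows that a convolution kernel by itself applies to a board of any size or shape, that a flatten-plus-dense head is what pins the board size, and that a global pooling step removes that dependence:

```python
from typing import List

Board = List[List[float]]

# Hypothetical 3x3 "plus"-shaped kernel; a real network learns many of these.
KERNEL: Board = [[0.0, 1.0, 0.0],
                 [1.0, 1.0, 1.0],
                 [0.0, 1.0, 0.0]]


def conv2d_same(board: Board, kernel: Board) -> Board:
    """Slide one odd-sized square kernel over the board with zero padding.

    The output has the same shape as the input, whatever that shape is.
    """
    rows, cols = len(board), len(board[0])
    k = len(kernel) // 2
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = 0.0
            for dr in range(-k, k + 1):
                for dc in range(-k, k + 1):
                    rr, cc = r + dr, c + dc
                    # Cells outside the board count as zero (padding).
                    if 0 <= rr < rows and 0 <= cc < cols:
                        total += board[rr][cc] * kernel[dr + k][dc + k]
            out[r][c] = total
    return out


def global_mean_pool(fmap: Board) -> float:
    """Collapse a feature map of any shape into a single number."""
    cells = [v for row in fmap for v in row]
    return sum(cells) / len(cells)


def dense_head(fmap: Board, weights: List[float]) -> float:
    """Flatten-and-dot head: one weight per cell, so the size is baked in."""
    cells = [v for row in fmap for v in row]
    if len(cells) != len(weights):
        raise ValueError(
            f"head expects {len(weights)} cells, got {len(cells)}")
    return sum(x * w for x, w in zip(cells, weights))


def make_board(rows: int, cols: int) -> Board:
    """Arbitrary deterministic stone pattern, for demonstration only."""
    return [[1.0 if (r + c) % 3 == 0 else 0.0 for c in range(cols)]
            for r in range(rows)]


if __name__ == "__main__":
    weights_19x19 = [0.01] * (19 * 19)  # head sized for 19x19 only
    for rows, cols in [(19, 19), (13, 13), (9, 13)]:
        fmap = conv2d_same(make_board(rows, cols), KERNEL)
        print(rows, cols, "pooled:", global_mean_pool(fmap))
        try:
            print("  dense:", dense_head(fmap, weights_19x19))
        except ValueError as err:
            print("  dense head fails:", err)
```

The same kernel runs unchanged on the 19x19, 13x13 and rectangular 9x13 boards; only the dense head, sized here for 361 cells, rejects the other shapes. Whether weights trained on one board transfer usefully to another is an empirical question that this sketch does not address.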
Regarding KataGo, its main contribution is a significant reduction in the cost of training an AlphaGo variant while maintaining competitive performance. This is very promising: after Deep Blue, creating a chess engine became cheaper and cheaper until one could run on a smartphone. We are far from that with computer Go players.
However, in the KataGo paper, the major gains are claimed to come from a) game-playing-specific or MCTS-specific improvements (playout cap randomisation, forced playouts and policy target pruning) and architecture-specific improvements (global pooling), or b) domain-specific improvements (auxiliary ownership and score targets). Finally, KataGo has a few game-specific input features (liberties, pass-alive regions and ladder features).
The KataGo paper itself says it very clearly. I quote, snipping for brevity:
Second, our work serves as a case study that there is still a significant efficiency gap between AlphaZero's methods and what is possible from self-play. We find nontrivial further gains from some domain-specific methods (...) We also find that a set of standard game-specific input features still significantly accelerates learning, showing that AlphaZero does not yet obsolete even simple additional tuning.
Finally, "it would obviously work so nobody tried" would make sense were it not for the extremely competitive nature of machine learning research, where every novel result is presented as a big breakthrough. Also, if something is obvious but never seems to make it to publication, chances are someone has tried it, it didn't work as expected, and they shelved the paper. We all know what happens to negative results in machine learning.