by fnbr
921 days ago
It’s very difficult to implement, and requires training the network to use it. I worked at DeepMind on projects that used MCTS. Even with access to the AlphaZero source code, it was very difficult to write another implementation that got the same results as the original.
|
> and requires training the network to use it.
I thought one of the benefits of MCTS was that, if you already have a value network, a general MCTS implementation can walk the tree of values that network produces, so no special update to the model is necessary. But I'm probably wrong about this.
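To make concrete what I mean by a general search walking the values from a fixed evaluator, here is a minimal sketch. Everything in it is an assumption made for illustration: the toy game (take 1 to 3 stones, whoever takes the last stone wins), the hand-written `value_fn` standing in for an already-trained value network, and the names `Node`, `simulate`, and `search`. It is plain UCT with leaf evaluation in place of rollouts, not AlphaZero's search, which also uses policy priors and trains the network on the search results.

```python
import math

ACTIONS = (1, 2, 3)  # toy game (assumption): remove 1-3 stones, last stone wins


def legal_actions(stones):
    return [a for a in ACTIONS if a <= stones]


def value_fn(stones):
    """Hypothetical stand-in for a frozen value network.

    Returns a value in [-1, 1] from the perspective of the player to move.
    It is never updated during search.
    """
    if stones == 0:
        return -1.0  # the previous player took the last stone, so the mover lost
    return -1.0 if stones % 4 == 0 else 1.0


class Node:
    def __init__(self, stones):
        self.stones = stones
        self.children = {}    # action -> Node
        self.visits = 0
        self.value_sum = 0.0  # summed values, from the mover's perspective here


def select_action(node, c=1.4):
    """UCT selection; child values are negated because the child's
    statistics are stored from the opponent's perspective."""
    best_action, best_score = None, -math.inf
    for action, child in node.children.items():
        if child.visits == 0:
            return action  # try every child once before comparing scores
        q = -child.value_sum / child.visits
        u = c * math.sqrt(math.log(node.visits) / child.visits)
        if q + u > best_score:
            best_action, best_score = action, q + u
    return best_action


def simulate(node):
    """Run one simulation and return the value for the player to move at node."""
    if node.stones == 0:
        value = value_fn(0)  # terminal position
    elif not node.children:
        # Leaf: expand it and evaluate with the fixed value function (no rollout).
        for a in legal_actions(node.stones):
            node.children[a] = Node(node.stones - a)
        value = value_fn(node.stones)
    else:
        action = select_action(node)
        value = -simulate(node.children[action])  # flip sign between players
    node.visits += 1
    node.value_sum += value
    return value


def search(stones, n_simulations=200):
    """Return the most-visited action at the root (assumes stones > 0)."""
    root = Node(stones)
    for _ in range(n_simulations):
        simulate(root)
    return max(root.children, key=lambda a: root.children[a].visits)


if __name__ == "__main__":
    print(search(5))  # the search should prefer leaving a multiple of 4
```

The search only ever reads from `value_fn`, which is the sense in which no model update is needed. Whether that matches the quality of a network trained together with the search is a separate question, and this sketch says nothing about it.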
(Also, it boosts my confidence to hear that even folks at DeepMind find MCTS difficult to implement :D I tried to implement a simple MCTS a few years back for a very small toy project. I was following a step-by-step explanation of how it worked, and even so, it was super difficult and very prone to subtle bugs.)