by suryabhupa
3744 days ago
The idea is that even though there's a policy network that can propose what it thinks is the best move in a given position, the tree search is still done to refine this choice and to "evaluate" it. This is why a value network, trained on self-play games generated by the policy network, is used in conjunction with MCTS to make sure that the moves AlphaGo picks are good ones.
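To make the "policy proposes, search refines" point concrete, here's a rough sketch on a toy game (single-pile Nim, where you take 1 to 3 stones and whoever takes the last stone wins). Everything in it is an assumption for illustration: `policy_net` and `value_net` are hand-written stand-ins rather than real networks, and the PUCT-style selection rule and the `c_puct` constant are simplified, not AlphaGo's actual implementation.

```python
import math

def legal_moves(pile):
    # Toy rules (assumed): remove 1, 2 or 3 stones from the pile.
    return [m for m in (1, 2, 3) if m <= pile]

def policy_net(pile):
    # Hypothetical stand-in for a policy network: a fixed prior that is
    # deliberately biased toward taking a single stone.
    moves = legal_moves(pile)
    weights = [3.0 if m == 1 else 1.0 for m in moves]
    total = sum(weights)
    return {m: w / total for m, w in zip(moves, weights)}

def value_net(pile):
    # Hypothetical stand-in for a value network, from the point of view of
    # the player to move. It knows nothing about non-terminal positions.
    return -1.0 if pile == 0 else 0.0

class Node:
    def __init__(self, pile, prior):
        self.pile = pile
        self.prior = prior      # P(move) from the policy stand-in
        self.visits = 0
        self.value_sum = 0.0    # from the view of the player who moved here
        self.children = {}

    def q(self):
        return self.value_sum / self.visits if self.visits else 0.0

def select_child(node, c_puct):
    # PUCT-style rule: mean value plus an exploration bonus scaled by the prior.
    def score(child):
        bonus = c_puct * child.prior * math.sqrt(node.visits) / (1 + child.visits)
        return child.q() + bonus
    return max(node.children.values(), key=score)

def simulate(node, c_puct):
    # Returns the value of `node` from the view of the player to move there.
    if node.pile == 0:
        value = -1.0  # the previous player took the last stone
    elif not node.children:
        # Leaf: expand with policy priors and evaluate with the value stand-in.
        for move, prior in policy_net(node.pile).items():
            node.children[move] = Node(node.pile - move, prior)
        value = value_net(node.pile)
    else:
        # Interior node: descend, flipping the sign between players.
        value = -simulate(select_child(node, c_puct), c_puct)
    node.visits += 1
    node.value_sum += -value
    return value

def search(pile, simulations=2000, c_puct=1.5):
    root = Node(pile, prior=1.0)
    for _ in range(simulations):
        simulate(root, c_puct)
    # Play the most-visited move, not the move with the highest prior.
    return max(root.children, key=lambda m: root.children[m].visits)

if __name__ == "__main__":
    prior = policy_net(6)
    print("prior's favourite move:", max(prior, key=prior.get))
    print("move chosen after search:", search(6))
```

With a pile of 6 the prior alone favours taking 1 stone, while the search should drift toward taking 2 (leaving the opponent a pile of 4), which is the kind of correction the tree search is there to provide.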