by cgearhart
384 days ago
In old-fashioned AI, it was generally believed that the best way to spend resources was to exactly evaluate as much of the search tree as possible. To that end, you use lightweight heuristics to guide the search in promising directions and optimizations like alpha-beta pruning to eliminate parts of the search space that cannot change the result. For finite games of perfect information like chess, this is hard to beat when the search is deep enough. (If you could evaluate the whole game tree from the start, you could always make optimal moves.) Stockfish follows this approach and provides ample evidence of the strength of this strategy.

Perhaps a bit flippantly, you can think of MCTS as “vibe search”, but more accurately it’s a sampling-based search. The basic theory is that we can summarize the information we’ve obtained to estimate our belief in the “goodness” of every possible move and (crucially) our confidence in that belief. Then we allocate search time to prioritize the branches that look best so far, while still spending some time on branches whose estimates we are least sure of. In this way MCTS iteratively constructs an explicit search tree for the game, with associated statistics that are used to guide decisions during play. The neural network does a “vibe check” on each new position in the tree for the initial estimate of “goodness”, and then the search process refines that estimate. (Ask the NN to guess at the current position; then play a bunch of simulations to make sure it doesn’t lead to obvious blunders.)
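
To make the "belief plus confidence" idea concrete, here is a minimal sketch of how a single node might pick which move to spend the next simulation on. It uses the textbook UCB1 rule, which is an assumption on my part: engines built around a neural network typically use a variant that also folds in the network's prior for each move. The names `MoveStats`, `ucb1_score`, `select_move` and the constant `c=1.4` are made up for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class MoveStats:
    """Per-move statistics stored in the search tree (hypothetical layout)."""
    visits: int = 0           # simulations that went through this move
    total_value: float = 0.0  # sum of simulation results (1 = win, 0 = loss)


def ucb1_score(stats: MoveStats, parent_visits: int, c: float = 1.4) -> float:
    """Mean result so far plus a bonus that shrinks as the move gets more visits."""
    if stats.visits == 0:
        return math.inf  # always try an unvisited move at least once
    mean = stats.total_value / stats.visits                         # the belief
    bonus = c * math.sqrt(math.log(parent_visits) / stats.visits)   # the uncertainty
    return mean + bonus


def select_move(children: dict) -> str:
    """Pick the move that should receive the next simulation."""
    parent_visits = sum(s.visits for s in children.values())
    return max(children, key=lambda m: ucb1_score(children[m], parent_visits))


# Equal visit counts: the move with the better average wins.
even = {"e4": MoveStats(50, 35.0), "h4": MoveStats(50, 15.0)}
print(select_move(even))

# Lopsided visit counts: the barely explored move gets the next simulation,
# even though its average is slightly worse.
lopsided = {"e4": MoveStats(100, 60.0), "h4": MoveStats(2, 1.0)}
print(select_move(lopsided))
```

The bonus term is what keeps the search from tunnelling into whichever move happened to look good early on; raising `c` makes it explore more, lowering it makes it trust the current averages more.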