| Monte Carlo Tree Search (MCTS, built on random playouts) is currently the best computer strategy for evaluating a Go position. This is likely due to the way Go works: random playouts provide a rough estimate of who controls which territory, and territory is how Go is scored. Recently, two deep-learning papers showed very impressive results: http://arxiv.org/abs/1412.3409 http://arxiv.org/abs/1412.6564 The neural networks were trained to predict which move an expert would make in a given position. MCTS takes a long time (100,000 playouts are typical); once trained, the neural nets are orders of magnitude faster. The nets output, for each move, the probability that an expert would play it, and all board positions are evaluated in a single forward pass. Current work centers on combining the two approaches: MCTS evaluates the best suggestions from the neural net. Expert human players are still unbeatable by computer Go programs. |
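To make the "net suggests, playouts evaluate" idea concrete, here is a minimal sketch. It is not taken from either paper and it is not a full tree search: it uses tic-tac-toe instead of Go, flat random playouts instead of a search tree, and `toy_policy` is a hypothetical hand-written stand-in for a trained move-prediction network. All function names and the `top_k` and `playouts` parameters are assumptions made for illustration.

```python
import random

WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 'X' or 'O' if that player has completed a line, else None."""
    for a, b, c in WIN_LINES:
        if board[a] != "." and board[a] == board[b] == board[c]:
            return board[a]
    return None


def legal_moves(board):
    """Indices of the empty cells."""
    return [i for i, cell in enumerate(board) if cell == "."]


def random_playout(board, to_move, rng):
    """Play uniformly random moves to the end; return the winner (None = draw)."""
    board = list(board)
    while winner(board) is None and legal_moves(board):
        board[rng.choice(legal_moves(board))] = to_move
        to_move = "O" if to_move == "X" else "X"
    return winner(board)


def toy_policy(board):
    """Hypothetical stand-in for a trained network: one call returns a
    probability for every legal move (centre > corners > edges)."""
    weights = {}
    for m in legal_moves(board):
        weights[m] = 3.0 if m == 4 else 2.0 if m in (0, 2, 6, 8) else 1.0
    total = sum(weights.values())
    return {m: w / total for m, w in weights.items()}


def choose_move(board, player, top_k=3, playouts=200, seed=0):
    """Spend playouts only on the top_k moves suggested by the policy."""
    rng = random.Random(seed)
    priors = toy_policy(board)
    candidates = sorted(priors, key=priors.get, reverse=True)[:top_k]
    opponent = "O" if player == "X" else "X"
    scores = {}
    for move in candidates:
        after = list(board)
        after[move] = player
        # Fraction of random playouts that end in a win for `player`.
        wins = sum(random_playout(after, opponent, rng) == player
                   for _ in range(playouts))
        scores[move] = wins / playouts
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    best, scores = choose_move(list("XX.OO...."), "X")
    print(best, scores)
```

The policy call is cheap and ranks every move at once, while the playout budget is spent only on the few moves it ranks highest. Raising `top_k` or `playouts` trades speed for a more thorough evaluation.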
It learns to master level from self-play.
http://www0.cs.ucl.ac.uk/staff/D.Silver/web/Applications_fil...
Also see his lecture on bootstrapping from tree-based search:
http://www.cse.unsw.edu.au/~cs9414/15s1/lect/1page/TreeStrap...
and Silver's overview of board game learning:
http://www0.cs.ucl.ac.uk/staff/D.Silver/web/Teaching_files/g...