Hacker News new | ask | show | jobs
by barfbagginus 734 days ago
Montecarlo Tree Search (MCTS) would be very ideal for this situation. Since the tree depth is really low, you would not need a neural network estimator. You would just load the entire game tree, and walk randomly through it, updating visit counts. The walk would be biased by the visit counts, and the biases would then converge to scores for each position.

See the following for a really nice tutorial for a slightly more advanced but more technically correct algorithm, Monte Carlo graph search (MCGS). This exploits the fact that some nodes in the game tree might are identical positions on the board and can be merged.

For your setup he could easily do either one, but the graph search might give you more mileage in the future:

github.com/lightvector/KataGo/blob/master/docs/GraphSearch.md

Once your scores have converged on the entire game tree, you can print out a crib sheet visually showing each position and the correct move. That might be the closest we can get to a human executable strategy. But the crib sheet might have strategic principles or hard rules that humans can identify

2 comments

I implemented expectiminimax in the browser which allows for a fairly strong AI player: https://keshav.is/coding/pt3/

I found that once you naively search the game tree beyond a depth of 8, it more or less converges on a particular choice. The presence of a neutral outcome (i.e. neither player claims the selected square) means the tree depth is technically infinite, but feasible to search thoroughly once the first few moves have been played.

> load the entire game tree, and walk randomly through it

Can't you just multiply all the percentages and just get an expected value for each field? Why the random walk? Can't you just calculate this exhaustively?

I was thinking the same ΓΈ, but does it work with the neutral field and changing players in the mix?
I mean there will be non-zero chance that the game could go on forever.
Read the article, my friend. Then you'll see what is so magical about the random walk algorithm - namely, it is easier to implement then other tree evaluation algorithms!

Of course, you can use a number of algorithms to calculate the value, and if you beat me to the punch and it's correct, how about this I'll buy you a burger. But which specific algorithm are you proposing, and where is its pseudocode and correctness proof?

And is it simpler? If so I'll implement that instead of the random walk!

I picked the random walk algorithm because it is much easier to implement than any other game tree evaluation algorithm I know.