by sillysaurus3
3799 days ago
They also train a mapping from the board state to a probability of how likely it is that a particular move will result in winning the game (the value of a particular move). How is this calculated? "When some termination criterion is met": were these criteria learned automatically, or coded/tweaked manually?
2. They just run a certain number of simulations, i.e. they compute n different branches all the way to the end of the game with various heuristics.
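
To make the "n simulations to the end of the game" idea concrete, here is a minimal sketch of estimating a move's value from playouts. Everything in it is an assumption for illustration: the toy game (one pile of stones, take 1 to 3, whoever takes the last stone wins) stands in for Go, the names `legal_moves`, `rollout`, and `estimate_move_value` are made up, and the playouts use uniformly random moves rather than the heuristics mentioned above.

```python
import random

def legal_moves(stones):
    # Toy game (an assumption, not Go): take 1-3 stones from a single pile.
    return [k for k in (1, 2, 3) if k <= stones]

def rollout(stones, player, rng):
    """Play uniformly random moves until the pile is empty; return the winner (0 or 1)."""
    while True:
        stones -= rng.choice(legal_moves(stones))
        if stones == 0:
            return player  # whoever takes the last stone wins
        player = 1 - player

def estimate_move_value(stones, move, n, rng):
    """Fraction of n random playouts that player 0 wins after playing `move`."""
    if move == stones:
        return 1.0  # taking the last stone ends the game with a win
    # After our move it is player 1's turn; count playouts that player 0 wins.
    wins = sum(rollout(stones - move, 1, rng) == 0 for _ in range(n))
    return wins / n

rng = random.Random(0)  # fixed seed so runs are repeatable
values = {m: estimate_move_value(5, m, 2000, rng) for m in legal_moves(5)}
best_move = max(values, key=values.get)
```

Each entry in `values` is just a win frequency over the simulated branches, so it gets less noisy as n grows. A real engine would presumably swap the uniform random choice inside `rollout` for a smarter playout policy.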