| HN Mirror



	by xoroshiro 3166 days ago
	Does this mean it learns what to search? I wonder why they thought that was a good idea. I thought the whole point of Monte Carlo (MC) search was that pruning algorithms like the ones used in chess wouldn't work in a larger search space.

2 comments

letlambda 3165 days ago

The policy network is a function from board states to a scoring of moves. Used with a greedy heuristic, i.e. picking the highest-rated move with no explicit lookahead, the policy network plays at a high amateur level.

This was... unexpectedly good.

It effectively reduces the branching factor of Go from the number of moves available to the number of moves actually worth considering.
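
To make the two ideas above concrete, here is a rough sketch, not taken from any actual system. The move probabilities are made up, and the helper names `greedy_move` and `moves_worth_considering` as well as the 0.9 probability-mass cutoff are assumptions chosen purely for illustration.

```python
def greedy_move(probs):
    """Pick the highest-rated move, with no lookahead at all."""
    return max(probs, key=probs.get)


def moves_worth_considering(probs, mass=0.9):
    """Keep the top-rated moves until they cover `mass` of the probability.

    This is one simple way to shrink the branching factor: a search
    would only expand the moves returned here instead of every legal move.
    """
    kept, total = [], 0.0
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    for move, p in ranked:
        kept.append(move)
        total += p
        if total >= mass:
            break
    return kept


# Hypothetical policy output for a position with six legal moves.
policy = {"D4": 0.55, "Q16": 0.30, "C3": 0.08, "K10": 0.04, "A1": 0.02, "T19": 0.01}

best = greedy_move(policy)
candidates = moves_worth_considering(policy, mass=0.9)
print(best, candidates)
```

With these made-up numbers, six legal moves shrink to three candidates. On a real board the same idea would cut hundreds of legal moves down to a handful.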


shmageggy 3165 days ago

That's what the policy is. Given a board state, the policy gives you a distribution over all available moves.
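
A minimal sketch of what "a distribution over all available moves" can look like in code, assuming the network emits one raw score per board point and that illegal points are simply masked out. The function name `policy_distribution` and the numbers are hypothetical, not from any real implementation.

```python
import math


def policy_distribution(logits, legal):
    """Softmax restricted to legal moves; illegal moves get probability 0."""
    # Subtract the max legal logit for numerical stability.
    m = max(logits[i] for i in legal)
    exps = {i: math.exp(logits[i] - m) for i in legal}
    z = sum(exps.values())
    return [exps[i] / z if i in exps else 0.0 for i in range(len(logits))]


# Made-up raw scores for a toy board with four points.
logits = [2.0, 1.0, 0.5, 3.0]
legal = [0, 1, 2]  # point 3 is occupied, so it is not an available move

dist = policy_distribution(logits, legal)
print(dist)
```

Point 3 has the highest raw score but ends up with probability 0 because it is not available. The remaining probabilities sum to 1 over the legal moves only.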
