by letlambda 3168 days ago

The policy network is a function from board states to a scoring of moves. Used with a greedy heuristic alone, i.e. picking the highest-rated move with no explicit lookahead, the policy network plays at a high amateur level.
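
To make "greedy, no lookahead" concrete, here is a rough sketch. The names `greedy_move` and `toy_policy`, the `(row, col)` move encoding, and the dict-of-scores return type are all made up for illustration; a real policy network would be a trained model, not a hard-coded table.

```python
from typing import Callable, Dict, Iterable, Tuple

Move = Tuple[int, int]      # (row, col); hypothetical move encoding
Scores = Dict[Move, float]  # move -> score, as a policy might return


def greedy_move(policy: Callable[[object], Scores],
                board: object,
                legal_moves: Iterable[Move]) -> Move:
    """Pick the highest-scored legal move, with no search at all."""
    legal = list(legal_moves)
    if not legal:
        raise ValueError("no legal moves")
    scores = policy(board)
    # Moves the policy did not score are treated as worst possible.
    return max(legal, key=lambda m: scores.get(m, float("-inf")))


def toy_policy(board: object) -> Scores:
    """Stand-in for a trained network: ignores the board, returns fixed scores."""
    return {(0, 0): 0.1, (1, 1): 0.7, (2, 2): 0.2}


best = greedy_move(toy_policy, board=None, legal_moves=[(0, 0), (1, 1), (2, 2)])
print(best)
```

The whole "player" is one forward pass plus an argmax, which is what makes the reported playing strength surprising.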

This was... unexpectedly good.

It effectively reduces the branching factor of Go from the number of moves available to the number of moves actually worth considering.
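
One way to picture the branching-factor reduction: turn the scores into probabilities and only expand the few moves that carry most of the probability mass. This is a toy sketch, not a claim about how any real Go engine prunes; `candidate_moves`, the 0.9 `mass` cutoff, and the synthetic scores are assumptions chosen for the example.

```python
import math
from typing import Dict, List, Tuple

Move = Tuple[int, int]  # (row, col); hypothetical move encoding


def softmax(scores: Dict[Move, float]) -> Dict[Move, float]:
    """Convert raw scores to probabilities (max-shifted for numerical stability)."""
    top = max(scores.values())
    exps = {m: math.exp(s - top) for m, s in scores.items()}
    total = sum(exps.values())
    return {m: e / total for m, e in exps.items()}


def candidate_moves(scores: Dict[Move, float], mass: float = 0.9) -> List[Move]:
    """Keep the highest-probability moves until they cover `mass` of the total."""
    probs = softmax(scores)
    ranked = sorted(probs, key=lambda m: -probs[m])
    kept, covered = [], 0.0
    for move in ranked:
        kept.append(move)
        covered += probs[move]
        if covered >= mass:
            break
    return kept


# Synthetic 19x19 board: every point scores 0.0 except three "good" moves.
scores = {(r, c): 0.0 for r in range(19) for c in range(19)}
good = [(3, 3), (15, 15), (9, 9)]
for m in good:
    scores[m] = 10.0

pruned = candidate_moves(scores)
print(len(scores), "available ->", len(pruned), "worth considering")
```

A search that only expands `pruned` at each node branches on a handful of moves rather than on every legal point.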