by jules 3754 days ago
It would be great if his co-commentator were a computer scientist knowledgeable about AlphaGo's algorithm.
1 comment

Indeed, I wish someone could talk about how the value and policy networks work.
As I understand it, the value network takes the place of the heuristic for scoring a given board layout, and the policy network takes the place of the heuristic for ordering moves from most to least promising.

When searching the game tree, only the N most promising moves at each ply (as ranked by the policy network) are examined, and the leaves of the game tree are scored by the value network.
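
To make that picture concrete, here is a minimal sketch in Python of the search as described above: a depth-limited negamax that keeps only the top N moves by policy score and evaluates leaves with a value function. It is only an illustration of this simplified description, not AlphaGo's actual code; as far as I know the real system uses Monte Carlo tree search rather than a fixed-depth search. Every name here (`top_n_search`, `policy`, `value`, and the toy take-1-to-3-stones game) is made up for the example, and the two "networks" are hand-written stand-ins.

```python
from typing import Callable, List, Optional, Tuple

State = int  # toy game: number of stones left; taking the last stone wins
Move = int   # number of stones to take (1, 2, or 3)


def legal_moves(state: State) -> List[Move]:
    return [m for m in (1, 2, 3) if m <= state]


def apply_move(state: State, move: Move) -> State:
    return state - move


def policy(state: State, move: Move) -> float:
    # Stand-in for a policy network: higher means "more promising".
    # Here it simply prefers moves that leave a multiple of 4 stones.
    return 1.0 if (state - move) % 4 == 0 else 0.0


def value(state: State) -> float:
    # Stand-in for a value network: score from the viewpoint of the
    # player to move, +1.0 for a likely win and -1.0 for a likely loss.
    return -1.0 if state % 4 == 0 else 1.0


def top_n_search(
    state: State,
    depth: int,
    n: int,
    policy_fn: Callable[[State, Move], float] = policy,
    value_fn: Callable[[State], float] = value,
) -> Tuple[float, Optional[Move]]:
    """Return (score, best_move) for the player to move."""
    moves = legal_moves(state)
    if depth == 0 or not moves:
        # Leaf of the search tree: ask the value function for a score.
        return value_fn(state), None

    # Order moves by policy score and keep only the n most promising.
    ranked = sorted(moves, key=lambda m: policy_fn(state, m), reverse=True)[:n]

    best_score, best_move = float("-inf"), None
    for move in ranked:
        child_score, _ = top_n_search(
            apply_move(state, move), depth - 1, n, policy_fn, value_fn
        )
        score = -child_score  # the opponent's gain is our loss
        if score > best_score:
            best_score, best_move = score, move
    return best_score, best_move


if __name__ == "__main__":
    print(top_n_search(10, depth=2, n=2))
```

The pruning is the `[:n]` slice: a move the policy ranks below the cutoff is never searched, so a bad policy can hide the best move no matter how good the value function is.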