by comboy
3313 days ago
> IIRC, the version without tree search beat the full version 25% of the time.

That would be amazing but it seems hard to believe. Any references?

I found this (which is also impressive):

> AlphaGo team then tested the performance of the policy networks. At each move, they chose the actions that were predicted by the policy networks to give the highest likelihood of a win. Using this strategy, each move took only 3 ms to compute. They tested their best-performing policy network against Pachi, the strongest open-source Go program, and which relies on 100,000 simulations of MCTS at each turn. AlphaGo's policy network won 85% of the games against Pachi!
1. https://www.tastehit.com/blog/google-deepmind-alphago-how-it...
2. https://gogameguru.com/i/2016/03/deepmind-mastering-go.pdf
> In a similar matchup, AlphaGo running on multiple computers won all 500 games played against other Go programs, and 77% of games played against AlphaGo running on a single computer.
But the full version of AlphaGo that runs on thousands of computers is much stronger than that, so I was mistaken.
Still, the fact that the non-distributed version is so strong even without tree search is pretty amazing. It beat all existing Go-playing programs a majority of the time. And with algorithmic advances and more training, it may eventually catch up to the best human players.