by comboy 3313 days ago
> IIRC, the version without tree search beat the full version 25% of the time.

That would be amazing but it seems hard to believe. Any references?

I found this (which is also impressive):

    AlphaGo team then tested the performance of the policy 
    networks. At each move, they chose the actions that were 
    predicted by the policy networks to give the highest 
    likelihood of a win. Using this strategy, each move took 
    only 3 ms to compute. They tested their best-performing 
    policy network against Pachi, the strongest open-source 
    Go program, and which relies on 100,000 simulations of 
    MCTS at each turn. AlphaGo's policy network won 85% of 
    the games against Pachi! 
1. https://www.tastehit.com/blog/google-deepmind-alphago-how-it...

2. https://gogameguru.com/i/2016/03/deepmind-mastering-go.pdf
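To make the "no tree search" setup in that quote concrete, here is a minimal sketch of greedy move selection from a policy output. It assumes the policy network returns a probability for each candidate move; `greedy_policy_move`, the `Move` alias, and the numbers are all hypothetical and only for illustration, not DeepMind's code.

```python
from typing import Dict, Set, Tuple

Move = Tuple[int, int]  # (row, col) on the board; hypothetical representation


def greedy_policy_move(move_probs: Dict[Move, float], legal_moves: Set[Move]) -> Move:
    """Pick the legal move with the highest policy probability, with no search."""
    # Drop moves that are not legal in the current position.
    candidates = {m: p for m, p in move_probs.items() if m in legal_moves}
    if not candidates:
        raise ValueError("no legal move has a policy probability")
    # A single argmax over the network output; no simulations or lookahead.
    return max(candidates, key=candidates.get)


# Made-up policy output for a toy position.
probs = {(3, 3): 0.41, (15, 15): 0.35, (9, 9): 0.24}
legal = {(15, 15), (9, 9)}  # (3, 3) is treated as occupied here
best = greedy_policy_move(probs, legal)
print(best)
```

The point is the cost: one forward pass plus an argmax per move, compared with the 100,000 MCTS simulations per turn the quote attributes to Pachi.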

2 comments

I believe I was remembering this from Wikipedia:

> In a similar matchup, AlphaGo running on multiple computers won all 500 games played against other Go programs, and 77% of games played against AlphaGo running on a single computer.

But the full version of AlphaGo that runs on thousands of computers is much stronger than that, so I was mistaken.

Still, the fact that the non-distributed version is so strong even without tree search is pretty amazing. It beat every existing Go-playing program a majority of the time. And with algorithmic advances and more training, it may eventually catch up to the best human players.

I don't know about the 25% figure, but the original AlphaGo paper mentioned that their best solution is a hybrid of neural nets and MCTS. However, the system can beat the best Go bots out there without doing MCTS, relying solely on the policy/value networks, which I think is truly amazing.