by frankchn 3755 days ago
AlphaGo played 5 informal games with shorter time controls alongside the formal games against Fan Hui (the European champion) back in October. "Time controls for formal games were 1 h main time plus three periods of 30 s byoyomi. Time controls for informal games were three periods of 30 s byoyomi."

The games were played back-to-back (formal, then informal) and AlphaGo won 3-2 in the informal games compared to 5-0 in the formal ones, so I would say worse.

1 comments

Whether AlphaGo’s architecture starts hitting diminishing returns to extra processing sooner than top humans do is a significantly different question from whether it scales down to a blitz game worse than they do. (Moreover, the difference between 1 h main time + 3x 30 s byoyomi and only 3x 30 s byoyomi is absolutely massive.)

DeepMind engineers have stated that the “cluster” version of AlphaGo beats the “single machine” version only about 70% of the time. This is despite the cluster version using roughly an order of magnitude more compute resources and presumably being able to search several moves deeper in the full search tree.
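To put the 70% figure in perspective, here is a rough sketch that converts a head-to-head win rate into a rating gap. It assumes the standard logistic Elo model on the usual 400-point scale, which is my assumption and not something DeepMind stated; the function names are made up for illustration.

```python
import math


def elo_gap_from_win_rate(p: float) -> float:
    """Rating gap implied by win probability p, assuming the logistic Elo model."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return 400.0 * math.log10(p / (1.0 - p))


def win_rate_from_elo_gap(gap: float) -> float:
    """Inverse of the above: expected score for the stronger side."""
    return 1.0 / (1.0 + 10.0 ** (-gap / 400.0))


gap = elo_gap_from_win_rate(0.70)
print(round(gap))  # 147
```

Under that model, a 70% win rate works out to a gap of roughly 150 Elo points, which is what the extra order of magnitude of compute buys if the figure is accurate.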

My impression is that there are some fundamental weaknesses in the value network (as currently trained and implemented), which Lee Sedol was able to exploit. If this is the case, giving the computer time to cover an extra move or two of search depth might not make a huge difference. Giving Lee Sedol twice as much time, however, would have had a significant impact on several of the games in this series, especially the last game. I strongly suspect that with a few extra minutes per move Lee Sedol would have avoided the poor trades in the late midgame that cost him the game.