Hacker News
by _snydly 3748 days ago
Was it AlphaGo losing the game, or Lee Sedol winning it?
4 comments

Lee Sedol winning: he kept his cool and did not make any mistakes. AlphaGo, on the other hand, went bonkers, especially towards the end, but it got into bad territory not because of silly mistakes but because of brilliant play by Lee Sedol.

Could it possibly be that both of the mistakes were bugs? Perhaps it suggested a non-sensical position such as (25.23, 13.15), and it was snapped to (19, 13) :D

I don't think they were bugs in the traditional sense. I think AlphaGo picked moves to try to maximize its probability of winning, and at some point the only way to win was for the opponent to make a suboptimal response. I remember reading somewhere that most of its training data is from amateur games. The model doesn't have a prior that AlphaGo is playing a professional who won't make a bad response. It probably would have resigned a lot earlier with that prior :)
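To make the "prior" argument concrete, here is a toy sketch. It is not how AlphaGo is implemented: the two candidate moves, every probability, and the resign threshold are invented purely for illustration.

```python
# Toy model: each candidate move has a chance of winning if the opponent
# answers badly and a (lower) chance if the opponent answers correctly.
# All numbers below are made up for illustration.
MOVES = {
    "solid": {"win_if_error": 0.08, "win_if_correct": 0.05},
    "trick": {"win_if_error": 0.90, "win_if_correct": 0.01},
}


def win_probability(move, p_opponent_errs):
    """Expected win chance given the assumed chance that the opponent errs."""
    return (p_opponent_errs * move["win_if_error"]
            + (1 - p_opponent_errs) * move["win_if_correct"])


def choose(moves, p_opponent_errs, resign_below=0.10):
    """Pick the move with the best expected win chance, or resign.

    resign_below is a hypothetical threshold, not AlphaGo's real one.
    """
    best = max(moves, key=lambda name: win_probability(moves[name], p_opponent_errs))
    p = win_probability(moves[best], p_opponent_errs)
    if p < resign_below:
        return "resign", p
    return best, p


if __name__ == "__main__":
    print(choose(MOVES, p_opponent_errs=0.30))  # "amateur" prior
    print(choose(MOVES, p_opponent_errs=0.02))  # "professional" prior
```

With these invented numbers, the "amateur" prior prefers the trick move that only works if the opponent slips, while the "professional" prior pushes every option under the threshold, so the toy player resigns instead.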

Another thing to keep in mind is that AlphaGo has no "memory", so every turn it looks at the board fresh. This means if the probabilities are very close you could have it jump around a bit either due to numerical noise from floating point calculations, model errors, or just tiny differences in probability making the behavior appear erratic and quick to change "strategy".
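A small sketch of that jumping-around effect, assuming a player that re-evaluates from scratch each turn and simply takes the highest estimate. The move names, probabilities, and noise scale are hypothetical, and this says nothing about AlphaGo's real search.

```python
import random


def pick_move(win_prob):
    """Stateless choice: return the move with the highest estimated win chance."""
    return max(win_prob, key=win_prob.get)


def with_noise(win_prob, rng, scale):
    """Perturb each estimate slightly, standing in for numerical or model noise."""
    return {move: p + rng.uniform(-scale, scale) for move, p in win_prob.items()}


def simulate(win_prob, turns, scale, seed=0):
    """Re-evaluate the same position each turn with fresh noise and record the pick."""
    rng = random.Random(seed)
    return [pick_move(with_noise(win_prob, rng, scale)) for _ in range(turns)]


if __name__ == "__main__":
    close = {"A": 0.3001, "B": 0.3000}   # nearly tied estimates
    clear = {"A": 0.60, "B": 0.30}       # one move clearly better
    print(sorted(set(simulate(close, turns=200, scale=1e-3))))
    print(sorted(set(simulate(clear, turns=200, scale=1e-3))))
```

When the estimates are nearly tied, the same tiny noise that is harmless in the clear case is enough to flip the choice from turn to turn, which from the outside looks like a change of "strategy".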

> Perhaps it suggested a non-sensical position such as (25.23, 13.15), and it was snapped to (19, 13) :D

AlphaGo doesn't work like that.

According to the head of DeepMind, AlphaGo made a mistake in evaluating move 79: https://twitter.com/demishassabis/status/708928006400581632
> Mistake was on move 79, but #AlphaGo only came to that realisation on around move 87

That's cool to think of AlphaGo having "realizations"

It is one way of saying that AlphaGo's value network can get things wrong.
Honestly, I think that is a meaningless distinction unless AlphaGo actually broke down in the middle of the match.
Given the number of errors by AlphaGo in the last 10 minutes, probably the former.
I think Lee Sedol won the game earlier by destroying AlphaGo's territory in the center. The commentator (Michael Redmond) was quite impressed with what he did there.
Those moves look like AlphaGo had calculated a loss way ahead of time.
Sometimes it's just not possible to stop the snowball rolling downhill. You can't always turn a retreat into an advance or a flanking move; sometimes the first step backwards just turns into a full-on rout.

I got the impression (possibly incorrectly) that AlphaGo was trying to throw curve-balls and be 'unexpected' in a way that might have 'forced' a mistake it could exploit.

Errors when already lost don't mean much.
Given the genius of LSD, probably the latter.