by AllanHoustonSt 2396 days ago
> Won't the winner be the one who just takes AlphaGo's recommended move every time without changing anything?

That's only true if AlphaGo never makes a mistake or if AlphaGo will 100% always make the better or equal decision than a human + computer at any given state of the board. I know the former certainly isn't true and I assume the latter isn't true either, but I don't know enough about Go to say for sure.

2 comments

> That's only true if AlphaGo never makes a mistake or if AlphaGo will 100% always make the better or equal decision than a human + computer at any given state of the board.

Even if AlphaGo makes mistakes, and somewhere on the board a better move can be found, you would also need the human to reliably spot it.

Eg: AlphaGo makes a move. Let's say that at least 20% of AlphaGo moves can be bettered. Is this one of them? How can you tell? Most of the time, you'll mistakenly think a move can be improved and end up playing a worse one.

But, let's make AlphaGo even more fallible. Let's say that at least 50% of AlphaGo moves can be bettered. Again, is this one of those? How can you tell? And more to the point, on the times you are wrong, are you more wrong than AlphaGo is with its mistakes? Because even if you imagine you can spot a better move than AlphaGo and pick the actual better move 50% of the time, you also need your mistaken moves to be better than AlphaGo's mistaken moves or you'll still lose.
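To put rough numbers on that last point, here is a minimal sketch of the expected-value argument. The function names and the example figures (a 2-point gain when you're right, a 5-point loss when you're wrong) are made up for illustration; nothing here comes from AlphaGo itself.

```python
def override_ev(p_right: float, gain: float, loss: float) -> float:
    """Expected change in win probability from overriding the engine once.

    p_right: chance your replacement move really is better (assumed).
    gain:    win probability you pick up when you are right (assumed).
    loss:    win probability you give away when you are wrong (assumed).
    """
    return p_right * gain - (1.0 - p_right) * loss


def break_even_accuracy(gain: float, loss: float) -> float:
    """Accuracy at which overriding neither helps nor hurts on average."""
    return loss / (gain + loss)


if __name__ == "__main__":
    # Right half the time, but your misses cost more than your hits earn.
    print(override_ev(p_right=0.5, gain=0.02, loss=0.05))
    print(break_even_accuracy(gain=0.02, loss=0.05))
```

With those made-up figures, being right 50% of the time loses about 1.5 points of win probability per override, and you'd need to be right about 71% of the time just to break even.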

Worst of all, you can rule out a really good ability to spot AlphaGo's mistakes already. Let's say 99% of AlphaGo's moves have a better option. If you could spot them all, you'd be beating AlphaGo regularly on your own. As no human can now beat AlphaGo, this plainly isn't true.

So it's likely that:

a) No human can reliably pick a better move than AlphaGo, and/or

b) No human can reliably spot that a move from AlphaGo can be improved, and/or

c) Human mistakes are worse than AlphaGo's mistakes, so even if you could fight it up to parity you'd still lose.

Like self-driving cars, it's not enough to outsmart the AI on one move; you need to outsmart it with positive expected value over all the moves you are confident enough to weigh in on.

AlphaGo (and, presumably, any AI system with a remotely similar means of operation) can output a score for each move. Actually, AG can output 2 scores: win percentage and branches explored.

You can use the relative scores to decide when to overrule the AI. Eg, if move A has a 50.1% win chance with 2k branches explored, and B has a 50.2% chance with 1.9k branches explored, I would go with the opinion of an expert human, as AG thinks the moves are essentially equal.
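A rough sketch of that rule, in case it helps. Everything here is hypothetical: the `Candidate` type, the 0.5-point gap and the 0.8 visit ratio are my own placeholders, not anything AlphaGo actually exposes.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Candidate:
    move: str
    win_rate: float  # engine's estimated win probability, 0.0 to 1.0
    visits: int      # branches explored for this move


def near_ties(cands: List[Candidate], max_gap: float = 0.005,
              min_visit_ratio: float = 0.8) -> List[Candidate]:
    """Moves the engine rates as essentially equal to its top choice."""
    best = max(cands, key=lambda c: c.win_rate)
    tied = []
    for c in cands:
        close = best.win_rate - c.win_rate <= max_gap
        # Only trust the comparison if both moves were searched about as deeply.
        ratio = min(c.visits, best.visits) / max(1, c.visits, best.visits)
        if c is best or (close and ratio >= min_visit_ratio):
            tied.append(c)
    return tied


def choose(cands: List[Candidate], human_pick: Optional[str] = None) -> str:
    """Take the human's move only when it is among the engine's near-ties."""
    tied = near_ties(cands)
    if human_pick is not None and any(c.move == human_pick for c in tied):
        return human_pick
    return max(cands, key=lambda c: c.win_rate).move


if __name__ == "__main__":
    cands = [Candidate("A", 0.501, 2000), Candidate("B", 0.502, 1900)]
    print(choose(cands, human_pick="A"))
```

With the A/B numbers above, the human's preference for A wins out because the engine sees the two as a toss-up; a move the engine rates far lower would be ignored and the top-rated move played instead.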

Self-driving cars are a terrible comparison, especially since the state of the art right now is that the best human drivers far outpace the best AI driving in both skill and flexibility.

Plus, you only have to outsmart the car AI once to 'win' - e.g. just override one 'drive into the highway barrier' or 'run over that pedestrian' AI mistake.