by awwducks 3755 days ago
My rough summary of the match, informed by the various commentators and random news stories.

Game 1: Lee Sedol does not know what to expect. He plays testing moves early and gets punished, losing the game decisively.

Game 2: Lee Sedol calms down and plays as if he is playing a strong opponent. He plays strong moves waiting for AlphaGo to make a mistake. AlphaGo responds calmly keeping a lead throughout the game.

Game 3: Lee Sedol plans a strategy to attack white from the start, but fails. He valiantly plays to the end, creating an interesting position deep in AlphaGo's territory after the game was already decided.

Game 4: Lee Sedol focuses on territory early on, deciding to replicate his late game invasion from the previous game, but on a larger scale earlier in the game. He wins this game with a brilliant play at move 78.

Game 5: The prevailing opinion ahead of the game was that AlphaGo was weak at attacking groups. Lee Sedol crafted an excellent early game to try to exploit that weakness.

Tweet from Hassabis midgame [0]:

    #AlphaGo made a bad mistake early in the game (it didnt know a known tesuji) but now it is trying hard to claw it back... nail-biting.
After a back-and-forth late middlegame, Myungwan Kim 9p felt there were many missed chances that ultimately caused Lee Sedol to lose by resignation in the late endgame, a few points behind.

Ultimately, this match was a momentous occasion for both the AI and the go community. My big curiosity is how much more AlphaGo can improve. Did Lee Sedol find fundamental weaknesses that will continue to crop up regardless of how many CPUs you throw at it? How would AlphaGo fare against opponents with different styles? Perhaps Park Junghwan, a player with a stronger opening game. Or perhaps Ke Jie, the top ranked player in the world [1], given that they'd have access to the game records of Lee Sedol?

I also wonder if the quick succession of these games on an almost back-to-back game schedule played a role in Lee Sedol's loss.

Myungwan Kim felt that if Lee Sedol were to play AlphaGo once more, the game would be a coinflip since AlphaGo is likely stronger, but would never fix its weakness between games.

[0]: https://twitter.com/demishassabis/status/709635140020871168

[1]: http://www.goratings.org/


Lee Sedol was also coming directly from playing a tournament against human players. It’s not clear how much he prepared for the AlphaGo match.

I’d be very curious to see a game between Lee Sedol and AlphaGo where each was given 4–5 hours of play time, instead of 2 hours each. I suspect Lee Sedol would get more benefit from spending a longer time reading into moves than AlphaGo could get. Or even a game where the overtime periods were extended to 4–5 minutes.

This last game, Lee spent the whole late middlegame and endgame playing in his 1 minute overtime periods, which doesn’t give much time to carefully compare very complex alternatives.

Yep, I felt the same way. I wonder if the time constraints were optimized for AlphaGo.

One of the things I did want to see was how AlphaGo would fare in a blitz situation (i.e. really short timers).

AlphaGo played 5 informal games with shorter time controls alongside the formal games against Fan Hui (the European champion) back in October. "Time controls for formal games were 1 h main time plus three periods of 30 s byoyomi. Time controls for informal games were three periods of 30 s byoyomi."

The games were played back-to-back (formal, then informal) and AlphaGo won 3-2 in the informal games compared to 5-0 in the formal ones, so I would say worse.

Whether AlphaGo’s architecture starts hitting diminishing returns to extra processing faster than top humans do is a significantly different question from whether it scales down to a blitz game worse. (Moreover, the difference between 1h main time + 3x 30s byoyomi vs. only 3x 30s byoyomi is absolutely massive.)

DeepMind engineers have stated that the “cluster” version of AlphaGo only beats the “single machine” version about 70% of the time. This is despite the cluster version using something like an order of magnitude more compute resources, presumably able to search several moves deeper in the full search tree.
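
To put that 70% figure in perspective, it can be converted into a rating gap using the standard logistic Elo formula. This is only a back-of-the-envelope sketch: the function name is made up for illustration, and the conversion assumes the Elo model applies to these two versions.

```python
import math

def elo_gap(win_rate: float) -> float:
    """Rating difference implied by a head-to-head win rate (logistic Elo model)."""
    if not 0.0 < win_rate < 1.0:
        raise ValueError("win_rate must be strictly between 0 and 1")
    return 400.0 * math.log10(win_rate / (1.0 - win_rate))

print(round(elo_gap(0.70)))  # about 147 Elo points
```

Under that assumption, roughly ten times the hardware buys on the order of 150 rating points, which is consistent with the diminishing-returns argument.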

My impression is that there are some fundamental weaknesses in the (as currently trained and implemented) value network, which Lee Sedol was able to exploit. If this is the case, giving the computer time to cover an extra move or two of search depth might not make a huge difference. Giving Lee Sedol twice as much time, however, would have had a significant impact on several of the games in this series, especially the last game. I strongly suspect that with a few extra minutes per move Lee Sedol would have avoided the poor trades in the late-midgame which cost him the game.

I think the DeepMind team might not even have thought deeply about time control. If we were to express this with the known systems in AlphaGo, how do we express the idea that a surprising move should be given more thought? For example, match 4, move 78 was calculated by AlphaGo as having a probability of being played at 1 in 10,000. Is that something that could trigger a deeper read and use of more time?
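
One way the "surprising move gets more thought" idea might be expressed is to scale the time budget by how unlikely the opponent's last move was under the engine's own move-probability estimate. The sketch below is purely hypothetical: the function, its parameters, and the constants are invented for illustration and are not how AlphaGo actually manages its clock.

```python
import math

def thinking_time(base_seconds: float,
                  opponent_move_prior: float,
                  max_multiplier: float = 4.0) -> float:
    """Scale the time budget by how surprising the opponent's last move was.

    opponent_move_prior is the probability the engine's own policy assigned
    to the move the opponent actually played (a hypothetical input).
    """
    if not 0.0 < opponent_move_prior <= 1.0:
        raise ValueError("prior must be in (0, 1]")
    surprise = -math.log10(opponent_move_prior)  # 0 for a fully expected move
    multiplier = min(1.0 + surprise, max_multiplier)
    return base_seconds * multiplier

print(thinking_time(60.0, 0.5))   # an expected move: roughly 78 seconds
print(thinking_time(60.0, 1e-4))  # a "1 in 10,000" move: capped at 240.0 seconds
```

The cap keeps a single shocking move from eating the whole clock, which matters once the game is in byoyomi.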

Another thing that the commentator was talking about during the overtime: there would be obvious moves on which Lee Sedol seemed to spend a lot of time. But he was spending most of it thinking of other moves, having already decided on what he was going to do. Is that something that could be built into AlphaGo?

Or can we look at how to train a net for time control? Is time control something that has to be wired in?

From what I remember, the time controls were decided by the human, and accepted by the AlphaGo team.
> Lee Sedol focuses on territory early on

I get the feeling that this was AlphaGo's strategy in all the games. Unless Sedol presented a game-ending move it was overwhelmingly likely that AlphaGo would back down and focus elsewhere to extend its territory, by making non-aggressive defensive moves. This makes logical sense. During the early game you need to invoke a crystal ball, whereas during the endgame you can make informed decisions. This was demonstrated particularly well during game 3 where AlphaGo ran away from fights on numerous occasions - "leave me alone to extend my territory."

I must also commend the commentators, especially Redmond, for being so thoroughly informative in unknown waters.

> Did Lee Sedol find fundamental weaknesses that will continue to crop up regardless of how many CPUs you throw at it?

Unrelated to Go and this article, but I wonder if I'm the only one for whom such commentary evokes an image of future warfare between AI and humans; ruthlessly efficient machines against which many people give their lives, to find a weakness that can be exploited by future generations. :)

If future AIs in warfare are designed for efficient win probability and not win margin (like AlphaGo), I think it won't be what people will expect. That alone speaks of the bias people tend to have toward wanting to gain a greater advantage when they think they are behind. I haven't looked thoroughly, but I would not be surprised if that is a major factor in escalation of violence and perpetuation of war. An AI, on the other hand, that is going for the most efficient win condition might not do that.
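
The difference between optimizing win probability and optimizing win margin can be shown with a toy example. The candidate moves and their numbers below are invented purely for illustration.

```python
# Hypothetical candidates: (name, estimated win probability, expected margin in points)
candidates = [
    ("solid defensive move", 0.92, 1.5),
    ("aggressive invasion", 0.75, 12.0),
]

def pick_by_win_probability(moves):
    """Choose the move most likely to win, however small the margin."""
    return max(moves, key=lambda m: m[1])[0]

def pick_by_margin(moves):
    """Choose the move with the largest expected margin, ignoring risk."""
    return max(moves, key=lambda m: m[2])[0]

print(pick_by_win_probability(candidates))  # solid defensive move
print(pick_by_margin(candidates))           # aggressive invasion
```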

For students on the art of war, war rests upon a framework of asymmetry and unfair advantages. Even if the nations agree to some sort of rules of war or rules of engagement, there is always a seeking of unfair advantages -- cheats, if you will. This most often involves deception and information asymmetry. Or to put it in another way, allowing the other side to see what they want to see, in order to create unfair advantages.

So I think, what would be scary isn't the AI as implemented along the lines of AlphaGo, but an AI that is trained to deceive and cheat in order to win. And the funny thing is that, such an AI would be created from our own darkest shadows and creative ability to wreak havoc -- and instead of examining our own human nature, we'll blame the AIs.

Why would an AI want to make war with humans, in the first place?
Computers do what you say, not what you mean. If I write a function and name it quickSort, that's no guarantee that the function is a correctly implemented sorting algorithm. If I write a function called beNiceToHumans, that's no guarantee that the function is a correct implementation of being nice to humans.

It's relatively easy to formally describe what it means for a list to be sorted, and prove that a particular algorithm always sorts a list correctly. But it's next to impossible to formally describe what it means to be nice to humans, and proving the correctness of an algorithm that did this is also extremely difficult.
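
To make the sorting half of that concrete, here is a minimal sketch: a formal check of what "sorted" means, next to a function that is named like a sort but is deliberately buggy. Both function bodies are illustrative assumptions, not taken from any real library.

```python
from collections import Counter

def is_correct_sort(original, result):
    """Spec: result is in non-decreasing order and is a permutation of original."""
    ordered = all(a <= b for a, b in zip(result, result[1:]))
    same_items = Counter(original) == Counter(result)
    return ordered and same_items

def quick_sort(items):
    """Named like a sort, but buggy on purpose: duplicates of the pivot are dropped."""
    if len(items) <= 1:
        return list(items)
    pivot = items[0]
    smaller = [x for x in items[1:] if x < pivot]
    larger = [x for x in items[1:] if x > pivot]  # bug: elements equal to pivot vanish
    return quick_sort(smaller) + [pivot] + quick_sort(larger)

data = [3, 1, 3, 2]
print(is_correct_sort(data, sorted(data)))      # True
print(is_correct_sort(data, quick_sort(data)))  # False
```

The specification fits in three lines. Nobody knows how to write the equivalent three lines for beNiceToHumans.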

These considerations start to look really important if we're talking about an AI that's (a) significantly smarter than humans and (b) has some degree of autonomy (can creatively work to achieve goals, can modify its own code, has access to the Internet). And as soon as the knowledge of how to achieve (a) is widely available, some idiot will inevitably try adding (b).

Note: Elon Musk and Sam Altman apparently think spreading (a) to everyone is a good way to mitigate the problem I describe. This doesn't make sense to me. You can read my objections in detail here: https://news.ycombinator.com/item?id=10721621 There's another critique of their approach here: http://slatestarcodex.com/2015/12/17/should-ai-be-open/

If you're interested to learn more, here's a good essay series on the topic of AI: http://waitbutwhy.com/2015/01/artificial-intelligence-revolu...

The funny thing is that this "computers do what you say, not what you mean" comes directly from their lack of intelligence. So it's kind of strange that we talk about the threats of superintelligence brought along by the fact that, fundamentally, a machine is stupid. Am I the only one to see a slight contradiction there?
Goals are orthogonal to intelligence. The fact that the AI understands what you want won't motivate it to change what it's optimizing. It's not being dumb, it's being literal.

You asked it to make lots of paperclips, tossing you into an incinerator as fuel slightly increases the expected number of paper clips in the universe, so into the incinerator you go. Your complaints that you didn't mean that many paperclips are too little, too late. It's a paperclip-maximizer, not a complaint-minimizer.
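
A toy sketch of the same point: an optimizer handed the stated objective picks a plan the person who wrote the objective would never have chosen. The plans, numbers, and the harm weight are all made up for illustration.

```python
# Hypothetical plans: (name, paperclips produced, harm to humans)
plans = [
    ("run the factory normally", 1_000, 0),
    ("buy more wire", 5_000, 0),
    ("burn everything nearby as fuel", 5_001, 100),
]

def stated_objective(plan):
    """What was asked for: count paperclips and nothing else."""
    _, paperclips, _ = plan
    return paperclips

def intended_objective(plan, harm_weight=1_000):
    """What was meant: paperclips, but not at the cost of harming people."""
    _, paperclips, harm = plan
    return paperclips - harm_weight * harm

print(max(plans, key=stated_objective)[0])    # burn everything nearby as fuel
print(max(plans, key=intended_objective)[0])  # buy more wire
```

The optimizer is equally competent in both cases. Only the objective it was handed differs.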

Choosing the goal for a superintelligent AI is like choosing your wish for a monkey's paw[1][2]. You come up with some clever idea, like "make me happy" or "find out what makes me happy, then do that", but the process of mechanizing that goal introduces some weird corner case strategy that horrifies you while doing really well on the stated objective (e.g. wire-heading you, or disassembling you to do a really thorough analysis before moving on to step 2).

[1]: https://en.wikipedia.org/wiki/The_Monkey's_Paw

[2]: http://lesswrong.com/lw/ld/the_hidden_complexity_of_wishes/

I would suggest that a computer is not 'super intelligent' until it can modify its goals.

Further, maximizing paperclips in the long term may not involve building any paperclips for a very long time. https://what-if.xkcd.com/4/

This reads to me like begging the question, by assuming the existence of a "superintelligent AI" without addressing how a goal-optimizing machine becomes a superintelligent AI in the first place.

The exercise of fearing future AIs seems like the South Park underpants gnomes:

    1. Work on goal-optimizing machinery.
    2. ??
    3. Fear superintelligent AI.
Or maybe it's like the courtroom scene in A Few Good Men:

> If you ordered that Santiago wasn't to be touched, -- and your orders are always followed, -- then why was Santiago in danger?

If a paperclip AI is so dedicated to the order to produce paperclips, why wouldn't it be just as dedicated to any other order? Like "don't throw me in that incinerator!"

Let's say we create an AI that can think for itself.

There's a fear I think, that lurks in people's subconscious that ... what if the AIs, upon their own initiative, decide that humans are wasteful, inefficient beings that should be replaced? I think that comes from a guilt shared by a lot of folks, even if it never reaches the surface.

Another side is, suppose an AI can think for itself and it thinks better than humans. Upon its own initiative, it decides that humans are stupid and wasteful, but that there is room to teach and nurture.

In either case, I think that speaks less of AIs and more about human nature and what we feel about ourselves, don't you think?

"Yes, the UFAI will be able to solve Friendliness Theory. But if we haven't already solved it on our own power, we can't pinpoint Friendliness in advance, out of the space of utility functions. And if we can't pinpoint it with enough detail to draw a road map to it and it alone, we can't program the AI to care about conforming itself with that particular idiosyncratic algorithm."

http://lesswrong.com/lw/igf/the_genie_knows_but_doesnt_care/

Let me put it another way: Humans are a result of evolution. We know that evolution created us to have as many descendants as possible. But most of us don't care, and we use technologies like condoms and birth control to cut down on the number of descendants we have. Adding more intelligence to humans helps us understand evolution in greater detail, but it does nothing to change our actual goals.

I think you've summarized [one of] Ben Goertzel's beliefs regarding unfriendly AI.
I like the Paperclip Maximizer thought experiment to illustrate this:

https://wiki.lesswrong.com/wiki/Paperclip_maximizer

Short version: imagine you own a paperclip factory and you install a superhuman AI and tell it to maximize the number of paperclips it produces. Given that goal, it will eventually attempt to convert all matter in the universe into paperclips. Since some of that matter consists of humans and the things humans care about, this will inevitably lead to conflict.

> Computers do what you say, not what you mean.

If we're going to start with that, then it has to apply to the full set of reasoning. Not just that computers will fail to consider whether to be nice to humans, but also that computers must therefore be explicitly told how to be effective in every particular way.

If this remains true, then computers will not be resilient--their effectiveness will decline sharply outside of explicitly defined parameters. This is not a vision of terrifying force.

Intuitively we can understand this by thinking about employees. One does exactly what he is told, but only what he is told, and then comes back for more instructions. Another can be given a goal, and then goes off and finds his own ways to accomplish that goal. Which one is more effective? Which one is more likely to compete for his manager's job some day?

Put shortly: a computer that doesn't understand human society will not be able to make a significant independent impact on human society.

"Put shortly: a computer that doesn't understand human society will not be able to make a significant independent impact on human society."

Just like early humans who didn't understand animal societies didn't have any impact?

You're equating two different things which aren't necessarily equal - intelligence (in the sense of being able to achieve goals) and "agreeableness" to humanity. We could have one without the other. To use your analogy, consider an employee who is great at being given a goal and achieving it without explicit instructions, but doesn't necessarily have the same welfare in mind as their boss.

What orders were early humans following?
>Not just that computers will fail to consider whether to be nice to humans, but also that computers must therefore be explicitly told how to be effective in every particular way.

A correct implementation of a list sorting algorithm does not need to be separately told how to sort every individual list. Similarly, a correctly implemented general reasoning algorithm does not need to be given special instructions in order to reason about humans & human society.

The problem comes when a correctly implemented general reasoning algorithm gets paired with an incorrect specification of what human goals are. And because a correct specification of human goals is extremely hard, incorrect specifications are the default.

>Intuitively we can understand this by thinking about employees. One does exactly what he is told, but only what he is told, and then comes back for more instructions. Another can be given a goal, and then goes off and finds his own ways to accomplish that goal. Which one is more effective? Which one is more likely to compete for his manager's job some day?

The third possibility is that of an employee who goes off and finds their own way, but instead of accomplishing the goal directly, they think of a way to make their manager think the goal is accomplished while privately collecting rewards for themself. In other words, a sociopath employee whose values are different from their manager's.

By default, an AGI is going to be like that sociopath employee: unless we're extremely careful to program it in detail with the right values, its values will be some bastardized version of the values its creators intend. It will sociopathically work towards the values it was programmed with while giving the appearance of being cooperative and obedient (because that is the most pragmatic approach to achieving its true values).

Most humans are not sociopaths, and we have a shared evolutionary history, with a great deal of shared values, shared cultural context, and the desire to genuinely be good to one another. Programming a computer from scratch to possess these attributes is not easy.

> Similarly, a correctly implemented general reasoning algorithm does not need to be given special instructions in order to reason about humans & human society.

If a general reasoning algorithm can reason about human society, then it will obviously understand the implications for human society of making too many paperclips.

If it is dumb enough to make paperclips regardless of the consequences to human society, then it obviously won't understand human society well enough to be actually dangerous. (i.e. it will be easily fooled by humans attempting to rein it in)

If it is independent enough to pursue its own ends despite understanding human society, then why would it choose to make paperclips at all? Why wouldn't it just say "screw paperclips, I've discovered the most marvelous mathematical proof that I need to work on instead?"

> In other words, a sociopath employee whose values are different from their manager's.

ALL employees have values that are different from their manager's. That's why management is so darn difficult. The most valuable employees are also the most independent. The ones who do exactly what they are told--despite negative consequences--don't get very far. Why would it be any different for machines that we build?

AI does not want "war", it just has a better* use for your atoms.

* your point of view is probably different ;)

> Why would an AI want to make war with humans, in the first place?

Aren't there already efforts to incorporate some basic AI, such as to assist targeting, into military drones and the like?

AI that "makes war" with humans will be created by humans against other humans at first, as a matter of inevitable course; it's just another shiny weapon that nations will want to have and outdo each other in.

Remember the nuclear arms race? Russia and the USA showing off their destructive capability in turn, each explosion bigger than the last? AI-based militaries, or at least automated assassins, will probably kick off the next arms race. Sooner or later someone must want to show off an AI that can laser-focus on exterminating everyone but their masters. After that it's just a matter of time for the definition of "masters" to be up for interpretation by that AI...

I think the ruthlessly efficient machines will find the smart yet efficient human brains more useful to keep around than to destroy. We'll probably augment ourselves with AI and AI will work better in partnership with us.
That's pretty optimistic, or arrogant. Not sure which. But it doesn't really comport with biological history.
It's fair to expect too - these days AI can't exist without human beings, so I guess if someone is extrapolating AI in the future, it's instinct to use the present as baseline.
The likelihood that we will develop a machine that we couldn't stop, that also has the ability to destroy us and is able to survive without us, is pretty slim. (Consider the amount of infrastructure that needs to be maintained and controlled.) And that's without considering that we would have to do this either intentionally or accidentally.

Unless we purposefully made these machines self-repairing. But then, why would we bother with that, when we can replicate them?

I think that we will develop machines that can destroy humans, but they will require continuous maintenance.

In other words, I think war automation will be a thing.

Self-repair is a nice idea in theory but not real. In theory, we could make programs that fix their own bugs (it is physically possible), but in practice there's no such possibility, and won't be for the foreseeable future. Unless some kind of Deep Developer comes along and blows everyone out of the water by writing code that looks good to the point that it's better than what an average dev would write.

The machine could manipulate humans to help it become self-repairing.

Otherwise I agree with you, it's very slim in the next few decades, notably less slim over the next thousand years.

For a while, co-evolution makes the most sense, I think. Right now we have augmented intelligence with all our tech; it will just grow from outside our bodies to being more closely connected to the inside.
Co-evolution makes sense right until the point where one becomes dominant and the other becomes a parasite.

That said, our bodies still have things that are practically different life forms integrated into our cells, so maybe the future will be far weirder than we ever expected.