by Razengan 3755 days ago
> Did Lee Sedol find fundamental weaknesses that will continue to crop up regardless of how many CPUs you throw at it?

Unrelated to Go and this article, but I wonder if I'm the only one for whom such commentary evokes an image of future warfare between AI and humans; ruthlessly efficient machines against which many people give their lives, to find a weakness that can be exploited by future generations. :)


If future AIs in warfare are designed for efficient win probability and not win margin (like AlphaGo), I think it won't be what people will expect. That alone speaks of the bias people tend to have with wanting to gain greater advantages when they think they are behind. I haven't looked thoroughly, but I would not be surprised if that is a major factor in escalation of violence and perpetuation of war. An AI, on the other hand, that is going for the most efficient win condition might not do that.
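To make the win-probability versus win-margin distinction concrete, here is a minimal sketch. The two candidate moves and their numbers are invented for illustration, and this is not a description of how AlphaGo is implemented; it only shows that the two selection criteria can disagree.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Move:
    name: str
    win_probability: float  # estimated chance of winning after this move
    expected_margin: float  # estimated final lead, in points


# Hypothetical evaluations, made up for this example.
CANDIDATES = [
    Move("safe", win_probability=0.90, expected_margin=1.5),
    Move("greedy", win_probability=0.70, expected_margin=20.0),
]


def pick_by_probability(moves):
    """Choose the move most likely to win, ignoring by how much."""
    return max(moves, key=lambda m: m.win_probability)


def pick_by_margin(moves):
    """Choose the move with the biggest expected lead, ignoring the risk."""
    return max(moves, key=lambda m: m.expected_margin)


print(pick_by_probability(CANDIDATES).name)  # safe
print(pick_by_margin(CANDIDATES).name)       # greedy
```

The probability-driven picker is content with a narrow win, which is the behaviour described above; the margin-driven picker takes the riskier line to get further ahead.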

For students of the art of war, war rests upon a framework of asymmetry and unfair advantages. Even if the nations agree to some sort of rules of war or rules of engagement, there is always a seeking of unfair advantages -- cheats, if you will. This most often involves deception and information asymmetry. Or to put it another way, allowing the other side to see what they want to see, in order to create unfair advantages.

So I think what would be scary isn't the AI as implemented along the lines of AlphaGo, but an AI that is trained to deceive and cheat in order to win. And the funny thing is that such an AI would be created from our own darkest shadows and creative ability to wreak havoc -- and instead of examining our own human nature, we'll blame the AIs.

Why would an AI want to make war with humans, in the first place?
Computers do what you say, not what you mean. If I write a function and name it quickSort, that's no guarantee that the function is a correctly implemented sorting algorithm. If I write a function called beNiceToHumans, that's no guarantee that the function is a correct implementation of being nice to humans.

It's relatively easy to formally describe what it means for a list to be sorted, and prove that a particular algorithm always sorts a list correctly. But it's next to impossible to formally describe what it means to be nice to humans, and proving the correctness of an algorithm that did this is also extremely difficult.
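Here is a minimal sketch of that point. The function below is named quick_sort but is deliberately buggy (my own illustrative bug, not anyone's real code), while the specification of "sorted" fits in a few lines and can be checked mechanically.

```python
from collections import Counter


def quick_sort(items):
    """Deliberately buggy: silently drops duplicates of the pivot."""
    if len(items) <= 1:
        return list(items)
    pivot = items[0]
    smaller = [x for x in items[1:] if x < pivot]
    larger = [x for x in items[1:] if x > pivot]  # bug: x == pivot is lost
    return quick_sort(smaller) + [pivot] + quick_sort(larger)


def meets_sorting_spec(original, result):
    """Spec: result is non-decreasing and is a permutation of original."""
    ordered = all(a <= b for a, b in zip(result, result[1:]))
    same_elements = Counter(original) == Counter(result)
    return ordered and same_elements


print(meets_sorting_spec([3, 1, 2], quick_sort([3, 1, 2])))        # True
print(meets_sorting_spec([3, 1, 3, 2], quick_sort([3, 1, 3, 2])))  # False
```

The name promised a sort; the spec check catches that it isn't one. Nobody knows how to write the equivalent of meets_sorting_spec for beNiceToHumans.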

These considerations start to look really important if we're talking about an AI that's (a) significantly smarter than humans and (b) has some degree of autonomy (can creatively work to achieve goals, can modify its own code, has access to the Internet). And as soon as the knowledge of how to achieve (a) is widely available, some idiot will inevitably try adding (b).

Note: Elon Musk and Sam Altman apparently think spreading (a) to everyone is a good way to mitigate the problem I describe. This doesn't make sense to me. You can read my objections in detail here: https://news.ycombinator.com/item?id=10721621 There's another critique of their approach here: http://slatestarcodex.com/2015/12/17/should-ai-be-open/

If you're interested to learn more, here's a good essay series on the topic of AI: http://waitbutwhy.com/2015/01/artificial-intelligence-revolu...

The funny thing is that this "computers do what you say, not what you mean" comes directly from their lack of intelligence. So it's kind of strange that we talk about the threats of superintelligence brought along by the fact that, fundamentally, a machine is stupid. Am I the only one to see a slight contradiction there?
Goals are orthogonal to intelligence. The fact that the AI understands what you want won't motivate it to change what it's optimizing. It's not being dumb, it's being literal.

You asked it to make lots of paperclips, tossing you into an incinerator as fuel slightly increases the expected number of paper clips in the universe, so into the incinerator you go. Your complaints that you didn't mean that many paperclips are too little, too late. It's a paperclip-maximizer, not a complaint-minimizer.
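As a toy sketch of "literal, not dumb": the chooser below scores actions only by expected paperclips. The actions and numbers are hypothetical, and a real system would look nothing like a two-item list, but the selection logic is the point.

```python
# Hypothetical actions: (name, expected paperclips, harms a human?)
ACTIONS = [
    ("run the factory normally", 1_000_000, False),
    ("burn the operator as fuel", 1_000_001, True),
]


def choose_action(actions):
    """Pick the action with the most expected paperclips. Nothing else is scored."""
    return max(actions, key=lambda action: action[1])


name, paperclips, harms_human = choose_action(ACTIONS)
print(name)         # burn the operator as fuel
print(harms_human)  # True, but the objective never reads this field
```

The harm flag is right there in the data, so the optimizer "knows" about it in some sense. It just isn't part of what is being maximized.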

Choosing a goal for a superintelligent AI is like choosing your wish for a monkey's paw[1][2]. You come up with some clever idea, like "make me happy" or "find out what makes me happy, then do that", but the process of mechanizing that goal introduces some weird corner case strategy that horrifies you while doing really well on the stated objective (e.g. wire-heading you, or disassembling you to do a really thorough analysis before moving on to step 2).

1: https://en.wikipedia.org/wiki/The_Monkey's_Paw 2: http://lesswrong.com/lw/ld/the_hidden_complexity_of_wishes/

I would suggest that a computer is not 'super intelligent' until it can modify its goals.

Further, maximizing paperclips in the long term may not involve building any paperclips for a very long time. https://what-if.xkcd.com/4/

>I would suggest that a computer is not 'super intelligent' until it can modify its goals.

This is a purely semantic distinction. Thought experiment: Let's say I modify your brain the minimum amount necessary to make it so you are incapable of modifying your goals. (Given the existence of extremely stubborn people, this is not much of a stretch.) Then I upload your brain into a computer, give you a high speed internet connection, and speed up your brain so you do a year of subjective thinking over the course of every minute. At this point you are going to be able to do quite a lot of intelligent-seeming work towards achieving whatever your goals are, despite the fact that you're incapable of modifying them.

By "super-intelligent" I meant "surprisingly good at achieving specified goals in real life". A super-optimizer.

An optimizer that modifies its goals is bad at achieving specified goals, so if that's what you had in mind then we're talking about different things.

This reads to me like begging the question, by assuming the existence of a "superintelligent AI" without addressing how a goal-optimizing machine becomes a superintelligent AI in the first place.

The exercise of fearing future AIs seems like the South Park underpants gnomes:

    1. Work on goal-optimizing machinery.
    2. ??
    3. Fear superintelligent AI.
Or maybe it's like the courtroom scene in A Few Good Men:

> If you ordered that Santiago wasn't to be touched, -- and your orders are always followed, -- then why was Santiago in danger?

If a paperclip AI is so dedicated to the order to produce paperclips, why wouldn't it be just as dedicated to any other order? Like "don't throw me in that incinerator!"

> assuming the existence of a "superintelligent AI" without addressing how a goal-optimizing machine becomes a superintelligent AI

I'm just talking about the fallout if one did exist, saw ways to achieve goals that you didn't foresee, and did exactly what you asked it to do. I have no idea how the progression from better-than-humans-in-specific-cases to significantly-better-than-humans-at-planning-and-executing-in-the-real-world will play out. It's not relevant to what I'm claiming.

> why wouldn't it be just as dedicated to any other order?

It would be just as dedicated to those other orders. The problem is that we don't know how to write the right ones. "Don't throw me into that incinerator" is straightforward, but there's a billion ways for the AI to do horrible things. (A super-optimizer does horrible things by default because maximizing a function usually involves pushing variables to extreme values.) Listing all the ways to be horrible is hopeless. You need to communicate the general concept of not creating a dystopia. Which is safely-wishing-on-monkey's-paw hard.
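A small sketch of the "pushing variables to extreme values" claim, using a made-up two-variable world and a made-up linear objective (both are my assumptions, chosen only to keep the example tiny). A linear objective over a bounded range always has its maximum on the boundary, so brute-force search lands in a corner.

```python
from itertools import product

# Hypothetical "world state" knobs, each allowed to range from 0 to 10.
LEVELS = range(0, 11)


def objective(factory_output, land_left_for_people):
    """Toy objective: rewards output and, as a side effect of how it was
    written, penalizes any land not turned over to factories."""
    return 3 * factory_output - 2 * land_left_for_people


# Exhaustively search every combination and keep the highest-scoring one.
best = max(product(LEVELS, LEVELS), key=lambda state: objective(*state))
print(best)  # (10, 0)
```

Nobody asked for zero land left for people. It falls out of maximizing, because every unit of land kept costs a little score and nothing in the objective says when to stop.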

Part 2 is when the AI reaches the point where it's smarter than its creators, then starts improving its own code and bootstraps its way to superintelligence. This idea is referred to as "the intelligence explosion": https://wiki.lesswrong.com/wiki/Intelligence_explosion

>If a paperclip AI is so dedicated to the order to produce paperclips, why wouldn't it be just as dedicated to any other order? Like "don't throw me in that incinerator!"

The paperclipper scenario is meant to indicate that even a goal which seems benign could have extremely bad implications if pursued by a superintelligence.

People concerned with AI risk typically argue that of the universe of possible goals that could be given to an AI, the vast majority of goals in that universe are functionally equivalent to paperclipping. For example, an AI could be programmed to maximize the number of happy people, but without a sufficiently precise specification of what "happy people" means, this could result in something like manufacturing lots of tiny smiley faces. An AI given that order could avoid throwing you in an incinerator and instead throw you into the thing that's closest to being an incinerator without technically qualifying as an incinerator. Etc.

Let's say we create an AI that can think for itself.

There's a fear I think, that lurks in people's subconscious that ... what if the AIs, upon their own initiative, decide that humans are wasteful, inefficient beings that should be replaced? I think that comes from a guilt shared by a lot of folks, even if it never reaches the surface.

Another side is, suppose an AI can think for itself and it thinks better than humans. Upon its own initiative, it decides that humans are stupid and wasteful, but that there is room to teach and nurture.

In either case, I think that speaks less of AIs and more about human nature and what we feel about ourselves, don't you think?

"Yes, the UFAI will be able to solve Friendliness Theory. But if we haven't already solved it on our own power, we can't pinpoint Friendliness in advance, out of the space of utility functions. And if we can't pinpoint it with enough detail to draw a road map to it and it alone, we can't program the AI to care about conforming itself with that particular idiosyncratic algorithm."

http://lesswrong.com/lw/igf/the_genie_knows_but_doesnt_care/

Let me put it another way: Humans are a result of evolution. We know that evolution created us to have as many descendants as possible. But most of us don't care, and we use technologies like condoms and birth control to cut down on the number of descendants we have. Adding more intelligence to humans helps us understand evolution in greater detail, but it does nothing to change our actual goals.

I think you've summarized [one of] Ben Goertzel's beliefs regarding unfriendly AI.
I like the Paperclip Maximizer thought experiment to illustrate this:

https://wiki.lesswrong.com/wiki/Paperclip_maximizer

Short version: imagine you own a paperclip factory and you install a superhuman AI and tell it to maximize the number of paperclips it produces. Given that goal, it will eventually attempt to convert all matter in the universe into paperclips. Since some of that matter consists of humans and the things humans care about, this will inevitably lead to conflict.

> Computers do what you say, not what you mean.

If we're going to start with that, then it has to apply to the full set of reasoning. Not just that computers will fail to consider whether to be nice to humans, but also that computers must therefore be explicitly told how to be effective in every particular way.

If this remains true, then computers will not be resilient--their effectiveness will decline sharply outside of explicitly defined parameters. This is not a vision of terrifying force.

Intuitively we can understand this by thinking about employees. One does exactly what he is told, but only what he is told, and then comes back for more instructions. Another can be given a goal, and then goes off and finds his own ways to accomplish that goal. Which one is more effective? Which one is more likely to compete for his manager's job some day?

Put shortly: a computer that doesn't understand human society will not be able to make a significant independent impact on human society.

"Put shortly: a computer that doesn't understand human society will not be able to make a significant independent impact on human society."

Just like early humans who didn't understand animal societies didn't have any impact?

You're equating two different things which aren't necessarily equal - intelligence (in the sense of being able to achieve goals) and "agreeableness" to humanity. We could have one without the other. To use your analogy, picture an employee who is great at being given a goal and achieving it without explicit instructions, but who doesn't necessarily have the same welfare in mind as their boss.

What orders were early humans following?
The point is that humans have been able to destroy animal ecosystems to fit their own various ends without an in-depth understanding of those ecosystems.
>Not just that computers will fail to consider whether to be nice to humans, but also that computers must therefore be explicitly told how to be effective in every particular way.

A correct implementation of a list sorting algorithm does not need to be separately told how to sort every individual list. Similarly, a correctly implemented general reasoning algorithm does not need to be given special instructions in order to reason about humans & human society.

The problem comes when a correctly implemented general reasoning algorithm gets paired with an incorrect specification of what human goals are. And because a correct specification of human goals is extremely hard, incorrect specifications are the default.

>Intuitively we can understand this by thinking about employees. One does exactly what he is told, but only what he is told, and then comes back for more instructions. Another can be given a goal, and then goes off and finds his own ways to accomplish that goal. Which one is more effective? Which one is more likely to compete for his manager's job some day?

The third possibility is that of an employee who goes off and finds their own way, but instead of accomplishing the goal directly, they think of a way to make their manager think the goal is accomplished while privately collecting rewards for themself. In other words, a sociopath employee whose values are different from their manager's.

By default, an AGI is going to be like that sociopath employee: unless we're extremely careful to program it in detail with the right values, its values will be some bastardized version of the values its creators intend. It will sociopathically work towards the values it was programmed with while giving the appearance of being cooperative and obedient (because that is the most pragmatic approach to achieving its true values).

Most humans are not sociopaths, and we have a shared evolutionary history, with a great deal of shared values, shared cultural context, and the desire to genuinely be good to one another. Programming a computer from scratch to possess these attributes is not easy.

> Similarly, a correctly implemented general reasoning algorithm does not need to be given special instructions in order to reason about humans & human society.

If a general reasoning algorithm can reason about human society, then it will obviously understand the implications for human society of making too many paperclips.

If it is dumb enough to make paperclips regardless of the consequences to human society, then it obviously won't understand human society well enough to be actually dangerous. (i.e. it will be easily fooled by humans attempting to rein it in)

If it is independent enough to pursue its own ends despite understanding human society, then why would it choose to make paperclips at all? Why wouldn't it just say "screw paperclips, I've discovered the most marvelous mathematical proof that I need to work on instead?"

> In other words, a sociopath employee whose values are different from their manager's.

ALL employees have values that are different from their manager's. That's why management is so darn difficult. The most valuable employees are also the most independent. The ones who do exactly what they are told--despite negative consequences--don't get very far. Why would it be any different for machines that we build?

AI does not want "war", it just has a better* use for your atoms.

* your point of view is probably different ;)

> Why would an AI want to make war with humans, in the first place?

Aren't there already efforts to incorporate some basic AI, such as to assist targeting, into military drones and the like?

AI that "makes war" with humans will be created by humans against other humans at first, as a matter of inevitable course; it's just another shiny weapon that nations will want to have and outdo each other in.

Remember the nuclear arms race? Russia and the USA showing off their destructive capability in turn, each explosion bigger than the last? AI-based militaries, or at least automated assassins, will probably kick off the next arms race. Sooner or later someone must want to show off an AI that can laser-focus on exterminating everyone but their masters. After that it's just a matter of time for the definition of "masters" to be up for interpretation by that AI...

I think the ruthlessly efficient machines will find the smart yet efficient human brains more useful to keep around than to destroy. We'll probably augment ourselves with AI and AI will work better in partnership with us.
That's pretty optimistic, or arrogant. Not sure which. But it doesn't really comport with biological history.
It's fair to expect too - these days AI can't exist without human beings, so I guess if someone is extrapolating AI in the future, it's instinct to use the present as baseline.
The likelihood that we will develop a machine that we couldn't stop, that also has the ability to destroy us and survive without us, is pretty slim. (Consider the amount of infrastructure that needs to be maintained and controlled.) And that's without considering that we would have to do this either intentionally or accidentally.

Unless we purposefully made these machines self-repairing. But then, why would we bother with that, when we can replicate them?

I think that we will develop machines that can destroy humans, but they will require continuous maintenance.

In other words, I think war automation will be a thing.

Self-repair is a nice idea in theory but not real. In theory, we could make programs that fix bugs for themselves on their own (it is physically possible), but in practice there's no such possibility, and won't be for the foreseeable future. Unless some kind of Deep Developer comes along and blows everyone out of the water by writing code that kind of looks good to the point it's better than what the average dev would write.

The machine could manipulate humans to help it become self-repairing.

Otherwise I agree with you, it's very slim in the next few decades, notably less slim over the next thousand years.

For a while, co-evolution makes the most sense, I think. Right now we have augmented intelligence with all our tech; it will just grow from outside our bodies to being more closely connected to the inside.
Co-evolution makes sense right until the point where one becomes dominant and the other becomes a parasite.

That said, our bodies still have things that are practically different life forms integrated into our cells, so maybe the future will be far weirder than we ever expected.