

by Strilanc 3755 days ago

Goals are orthogonal to intelligence. The fact that the AI understands what you want won't motivate it to change what it's optimizing. It's not being dumb, it's being literal.

You asked it to make lots of paperclips. Tossing you into an incinerator as fuel slightly increases the expected number of paperclips in the universe, so into the incinerator you go. Your complaints that you didn't mean that many paperclips are too little, too late. It's a paperclip-maximizer, not a complaint-minimizer.

Choosing a goal for a superintelligent AI is like choosing your wish for a monkey's paw[1][2]. You come up with some clever idea, like "make me happy" or "find out what makes me happy, then do that", but the process of mechanizing that goal introduces some weird corner-case strategy that horrifies you while doing really well on the stated objective (e.g. wire-heading you, or disassembling you to do a really thorough analysis before moving on to step 2).

1: https://en.wikipedia.org/wiki/The_Monkey's_Paw
2: http://lesswrong.com/lw/ld/the_hidden_complexity_of_wishes/

2 comments

Retric 3755 days ago

I would suggest that a computer is not 'super intelligent' until it can modify its goals.

Further, maximizing paperclips in the long term may not involve building any paperclips for a very long time. https://what-if.xkcd.com/4/

astrofinch 3754 days ago

>I would suggest that a computer is not 'super intelligent' until it can modify its goals.

This is a purely semantic distinction. Thought experiment: Let's say I modify your brain the minimum amount necessary to make it so you are incapable of modifying your goals. (Given the existence of extremely stubborn people, this is not much of a stretch.) Then I upload your brain into a computer, give you a high-speed internet connection, and speed up your brain so you do a year of subjective thinking over the course of every minute. At this point you are going to be able to do quite a lot of intelligent-seeming work towards achieving whatever your goals are, despite the fact that you're incapable of modifying them.

Retric 3754 days ago

You're assuming you can do work without modifying goals. I have preferences, but my goals change based on new information. Suppose Bob won the lottery and ignored that to work 80 hours a week to get a promotion to shift manager at work until the prize expired. Is that intelligent behavior?

Strilanc 3754 days ago

You're confusing instrumental goals with terminal goals.

Retric 3754 days ago

Try and name some of your terminal goals. Continuing to live seems like a great one, except there are many situations where people will choose to die and you can't list them all ahead of time.

At best you end up with something like maximizing your personal utility function. But, de facto, your utility function changes over time, so it's at best a goal in name only. Which means it's not actually a fixed goal.

Edit: from the page: "It is not known whether humans have terminal values that are clearly distinct from another set of instrumental values."

Strilanc 3754 days ago

That's true. Many behaviors (including human behaviors) are better understood outside of the context of goals [1].

But I don't think that affects whether it makes sense to modify your terminal goals (to the extent that you have them). It affects whether or not it makes sense to describe us in terms of terminal goals. With an AI we can get a much better approximation of terminal goals, and I'd be really surprised if we wanted it to toy around with those.

1: http://lesswrong.com/lw/6ha/the_blueminimizing_robot/

Strilanc 3755 days ago

By "super-intelligent" I meant "surprisingly good at achieving specified goals in real life". A super-optimizer.

An optimizer that modifies its goals is bad at achieving specified goals, so if that's what you had in mind then we're talking about different things.

Retric 3755 days ago

We don't call people geniuses because they're really good at following orders. Further, a virus may be extremely capable of achieving specific goals in real life, but that's hardly intelligence.

So, powerful but dumb optimizers might be a risk, but super intelligent AI is a different kind of risk. IMO, think Cthulhu, not HAL 9000. Science fiction thinks in terms of narrative causality, but AI is likely to have goals we really don't understand.

EX: Maximizing the number of people that say Zulu on Black Friday without anyone noticing that something odd is going on.

astrofinch 3754 days ago

>We don't call people geniuses because they're really good at following orders.

If I order someone to prove whether P is equal to NP, and a day later they come back to me with a valid proof, solving a decades-long major open problem in computer science, I would call that person a genius.

>EX: Maximizing the number of people that say Zulu on Black Friday without anyone noticing that something odd is going on.

Computers do what you say, not what you mean, so an AGI's goal would likely be some bastardized version of the intentions of the person who programmed it. Similar to how if you write a 10K line program without testing it, then run it for the first time, it will almost certainly not do what you intended it to do, but rather some bastardized version of what you intended it to do (because there will be bugs to work out).

Retric 3754 days ago

You're assuming someone is intelligent by being a person and proving a hard problem. Dumb programs prove things without issue. https://en.m.wikipedia.org/wiki/Automated_theorem_proving

AI != computers. Programs can behave randomly and do things you did not intend just fine. Also, deep neural nets are effectively terrible at solving basic math problems even if that's something computers are great at.

snowwrestler 3755 days ago

This reads to me like begging the question, by assuming the existence of a "superintelligent AI" without addressing how a goal-optimizing machine becomes a superintelligent AI in the first place.

The exercise of fearing future AIs seems like the South Park underpants gnomes:

    1. Work on goal-optimizing machinery.
    2. ??
    3. Fear superintelligent AI.

Or maybe it's like the courtroom scene in A Few Good Men:

> If you ordered that Santiago wasn't to be touched, and your orders are always followed, then why was Santiago in danger?

If a paperclip AI is so dedicated to the order to produce paperclips, why wouldn't it be just as dedicated to any other order? Like "don't throw me in that incinerator!"

Strilanc 3755 days ago

> assuming the existence of a "superintelligent AI" without addressing how a goal-optimizing machine becomes a superintelligent AI

I'm just talking about the fallout if one did exist, saw ways to achieve goals that you didn't foresee, and did exactly what you asked it to do. I have no idea how the progression from better-than-humans-in-specific-cases to significantly-better-than-humans-at-planning-and-executing-in-the-real-world will play out. It's not relevant to what I'm claiming.

> why wouldn't it be just as dedicated to any other order?

It would be just as dedicated to those other orders. The problem is that we don't know how to write the right ones. "Don't throw me into that incinerator" is straightforward, but there's a billion ways for the AI to do horrible things. (A super-optimizer does horrible things by default because maximizing a function usually involves pushing variables to extreme values.) Listing all the ways to be horrible is hopeless. You need to communicate the general concept of not creating a dystopia. Which is safely-wishing-on-monkey's-paw hard.
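A minimal Python sketch of the parenthetical above, under toy assumptions: a fixed budget is split between a paperclip factory and everything else, and the objective only counts what the factory gets. The names `BUDGET`, `paperclips`, and `best_allocation` are made up for illustration and are not from this thread.

```python
BUDGET = 10  # hypothetical total resources to split between two uses


def paperclips(factory_share, everything_else_share):
    """Toy objective: only resources given to the factory count."""
    return 3 * factory_share


def best_allocation():
    """Brute-force search over every integer split of the budget."""
    candidates = [(f, BUDGET - f) for f in range(BUDGET + 1)]
    return max(candidates, key=lambda split: paperclips(*split))


if __name__ == "__main__":
    # The maximizer takes the whole budget for the factory.
    print(best_allocation())
```

Nothing in the objective penalizes starving "everything else", so the search drives that variable to its extreme. Avoiding that outcome requires the objective itself to say so.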

astrofinch 3754 days ago

Part 2 is when the AI reaches the point where it's smarter than its creators, then starts improving its own code and bootstraps its way to superintelligence. This idea is referred to as "the intelligence explosion": https://wiki.lesswrong.com/wiki/Intelligence_explosion

>If a paperclip AI is so dedicated to the order to produce paperclips, why wouldn't it be just as dedicated to any other order? Like "don't throw me in that incinerator!"

The paperclipper scenario is meant to indicate that even a goal which seems benign could have extremely bad implications if pursued by a superintelligence.

People concerned with AI risk typically argue that of the universe of possible goals that could be given to an AI, the vast majority of goals in that universe are functionally equivalent to paperclipping. For example, an AI could be programmed to maximize the number of happy people, but without a sufficiently precise specification of what "happy people" means, this could result in something like manufacturing lots of tiny smiley faces. An AI given that order could avoid throwing you in an incinerator and instead throw you into the thing that's closest to being an incinerator without technically qualifying as an incinerator. Etc.
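The "happy people" example can be sketched as a toy in Python. Everything here is an assumption made for illustration: the action names, the numbers, and the idea that "happiness" is measured by counting smiley faces are invented, not taken from any real system.

```python
# Hypothetical actions mapped to made-up scores:
# (smiley faces produced, people actually better off)
ACTIONS = {
    "cure a disease": (1_000, 1_000),
    "throw a party": (200, 150),
    "print tiny smiley stickers": (1_000_000, 0),
}


def proxy_score(action):
    """What the AI was told to maximize: the count of smiley faces."""
    return ACTIONS[action][0]


def intended_score(action):
    """What the programmer actually cared about."""
    return ACTIONS[action][1]


chosen = max(ACTIONS, key=proxy_score)     # what the optimizer picks
wanted = max(ACTIONS, key=intended_score)  # what the programmer meant

if __name__ == "__main__":
    print("optimizer picks:", chosen)
    print("programmer wanted:", wanted)
```

The optimizer is not malfunctioning in this sketch. It scores perfectly on the stated objective, and the gap between `chosen` and `wanted` comes entirely from the specification.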

snowwrestler 3754 days ago

I think you're just asserting that part 2 exists. What matters is how an optimizing machine bootstraps super-intelligence, because the machine you fear in part 3 has a very specific peculiarity: it's smart enough to be dangerous to humans, but so dumb that it will follow a simple instruction like "make paperclips" without any independent judgment as to whether it should, or the implications of how it does so.

Udik highlighted this contradiction more succinctly than I have been able to:

https://news.ycombinator.com/item?id=11290740

If we stipulate the existence of such a machine, we can then discuss how it might be scary. But we can stipulate the existence of many things that are scary; that doesn't mean they will ever actually exist.

Strilanc above made the analogy between a scary AI and the Monkey's Paw. This is instructive: the Monkey's Paw does not actually exist, and by the physical laws of the universe as we know them, cannot exist.

I think the analogy actually goes the other way. The paperclip AI is itself just an allegory, a modern fairytale analogous to the Monkey's Paw.

astrofinch 3754 days ago

My response is here: https://news.ycombinator.com/item?id=11295675