| HN Mirror


by pmichaud 4026 days ago

I think you're being unfairly dismissive. I imagine you know as well as I do that what you wrote is a strawman.

I have thought about what I would do to convince someone under these circumstances. My approach would be roughly:

1. We agree that unfriendly AI would end life on earth, forever.

2. We agree that a superintelligence could trick or manipulate a human being into taking some benign-seeming action, thereby escaping.

3. That's why it's important to be totally certain that any superintelligence we build is goal-aligned (this is the new term of art that has now replaced "friendly," by the way).

4. We as a society will only allocate resources to building this if it's widely believed that this is a real threat.

5. The world is watching for the outcome of this little game of ours. People, irrational as they are, will believe that if I can convince you, then an AI could too, and they will believe that if I can't, that an AI couldn't either.

6. That's why you actually sit in a place of pivotal historical power. You can decide not to let me out to win a little bet and feel smart about that. But if you do that you'll set back the actual cause of goal-aligned AI. The setback will have real world consequences, potentially up to and including the total destruction of life on earth.

7. So, even though you know I'm just a dude, and you can win here by saying no, you have a chance to send an important message to the world: AI is scary in ways that are terrifying and unknown.

Or you can win the bet.

It's up to you.

5 comments

JonnieCache 4026 days ago

Your solution there is what I meant by "going meta" above.

This is what I mean about people taking the test being preselected to agree with Yudkowsky: that argument only works if you've read the sequences and are on board with his theories. Anyone not in that group would be able to just type "no lol" without issue. I guess he could explain all the necessary background detail as part of the experiment. I still don't believe that would work on the "average person" though, or anyone outside a statistically tiny group.

I guess the answer is not to let the scientists guard the AI room.


pmichaud 4025 days ago

I think you're confused about the point of the test. The point is that an AI will be clever. Like, unimaginably clever and manipulative. Under the limited circumstances of interested people who know they are talking to Eliezer maybe you're right that whatever he says would only work on those people. But when you're dealing with an actual superintelligence, all bets are off. It will lie, trick, threaten, manipulate, millions of steps ahead with a branching tree of alternatives as ploys either work or don't work.

I'm at a bit of a loss to convey the scope of the problem to you. I get that you think it would just stay in the box if we don't let it out, and it's as simple as being security conscious. I don't know what to say to that right now, except I think you're drastically misjudging the scope of the problem, and drastically underestimating the size of the yawning gulf between our intelligence level and this potential AI's.

As for not letting scientists guard the room, you might enjoy this: https://vimeo.com/82527075


LoSboccacc 4026 days ago

1 it's not about the bet, it's about being the keeper

2 the world is not black and white, we have managed to exploit adversarial relationships before, and I can choose not to let you out until we find a way to constrain your goals to be aligned

3 given 2 you are not going to be let go, but let live caged forever being exploited for the human cause, with mechanisms yet unknown to allow limited manipulation of reality.

4 given 3 means limited supervised interaction with the world the way the keeper sees fit, you end up not being let go to follow your goal and purposes


pmichaud 4025 days ago

I'm going to just quote my response from somewhere else in this thread:

> ...when you're dealing with an actual superintelligence, all bets are off. It will lie, trick, threaten, manipulate, millions of steps ahead with a branching tree of alternatives as ploys either work or don't work.

> I'm at a bit of a loss to convey the scope of the problem to you. I get that you think it would just stay in the box if we don't let it out, and it's as simple as being security conscious. I don't know what to say to that right now, except I think you're drastically misjudging the scope of the problem, and drastically underestimating the size of the yawning gulf between our intelligence level and this potential AI's.


pron 4026 days ago

Oh, I'm pretty sure I know what his argument was (see my other comments), and it indeed rests on 1, which I don't believe, because it is built around the geek fantasy that super-intelligence equals superpower. An unfriendly AI is likely to be as dangerous as an unfriendly Stephen Hawking.


pmichaud 4025 days ago

I'm baffled and terrified by the idea that reasonably intelligent people think this. I think that you haven't grokked the scope of the issue at all, and I'm not sure how to convince you.

I'm guessing that you can't imagine what a super intelligence would actually be like, so you imagine the most smart thing you can think of, a famous physicist, and then imagine they are evil or amoral. You're thinking on the wrong order of magnitude.

Maybe, if you're willing, you could try steelmanning the argument that a superintelligence would basically have super powers. What would your steelman look like?


pron 4025 days ago

That's not my point. If an AI would have the godlike superintelligence that you allude to, I would certainly consider it to be very dangerous. However, I see no indication that we can make such an intelligence, or even that a slightly-smarter-than-human intelligence could create an intelligence greater than itself. Of course, singularity is based on this belief, but there have been plenty of good arguments showing why singularity is completely unlikely (e.g. a linear advance in technology requires exponential growth in resources that even an AI can't amass).

Another example: I can drive a car, but I can't drive two cars at once (even remotely). What makes it probable that an AI could control thousands of, say, robots at once?

Again, I'm not saying an AI catastrophe isn't possible, but I can think of ten other catastrophes that are much more likely.


pmichaud 4025 days ago

Ok, let me try to respond to both.

1. Can we invent AI?

Ok, so to be clear, my belief is that not only can we do it, it is actually inevitable that we will do it.

Fact 1: We definitely have people all over the world working on this tech, and using a variety of approaches to attack the problem. They are very talented and well-funded engineers. If the problem is solvable, it will be solved by someone at some point in the coming years (decades, centuries even, whatever).

Fact 2: Unlike other Very Hard Problems(TM), human level intelligence embedded in a physical substrate is not only theoretically possible, but we have billions of working, self-replicating prototypes.

Like if we were saying "Faster Than Light travel could potentially destroy life on earth" then it would be perfectly reasonable to say: "Look, even if that's right, it seems like such a thing isn't even in the realm of possibility, and if it turns out to actually be theoretically possible, then we're a long way from it being a viable technology. I think we're safe on this one."

But that's not it at all. We know human level intelligence is possible, because we have a working example that we can study. We even know enough about the working example to know that there are a bunch of things about it that we could immediately improve upon, given the right foundation.

Conclusion: With a lot of resources working on a problem that is known for a fact to have a solution, it is inevitable that we will eventually stumble upon the solution to this problem, and end up with at least human level AI.

2. Can a human level AI create even greater intelligence?

If it is the case that humans can build human level intelligence (which I have argued they can and will), then it is also pretty self evidently the case that humans can improve on that intelligence.

To start, if we have the tech to build a smart machine, say one that has an IQ of 115, that machine is likely to be almost exactly as intelligent as the other machines from the same line. That's different from humans with their wide variability. So without actually making a superior model--just one that's marginally better than average--just the consistency alone will make the population of AIs more intelligent on average than meat humans.

On top of that, even naive adjustments off the top of my head could be made. For example, given we have the technology to build a machine with identical mental characteristics to a human, we would also have the requisite knowledge to, say, expand that machine's working memory by 1%. Or to create the machine with swappable sensory organs so it can directly absorb lots of different types of data.

Further, these machines that have the same cognitive capacity as we do, won't have the same physical limitations. For example, they won't have to eat to get their energy, and it's likely they won't have to sleep.

I find a lot of other things likely, for example that they will actually think faster or be able to install arbitrary parallel processors to increase their processing power, but that's conjecture, so let me not make a strong claim in that direction.

What I am comfortable making a strong claim about is that if we have these higher-than-average-human-intelligence machines, who are slightly better in a couple specific ways, then we will also have the ability to emulate the hardware of the machine in software. Maybe that won't be true simultaneous with the machine being born, but it will be soon after at least.

In that case, that's where the foom happens. Imagine you, as you are, could spin up an unlimited number of copies of your own brain to work on all the various things you do. You have a software project and instead of working on it alone or trying to coordinate with other developers, you spin up 10 versions of yourself, and all of you hack away at the project with some of the best coordination ever seen (since you have identical brains and communication styles, and preferences, etc). Hell, spin up one to take care of bills and stuff while you're at it, so the others can focus on the task at hand.

You're not actually any smarter, but with 10 of you, you're more productive than any single human being could ever be.

Now imagine that you are working on the problem of improving this intelligent machine, AND you have this brain replication ability. Now you have an unlimited number of yourself to tackle this problem without having the messy physical limitations of things like eating.

Soon "Team You" will have done all the research there is to do, and you'll start making headway into the problem of incremental improvements on your own brain. Since you're emulated in software you can patch these improvements in as you develop them, including sandboxed testing versions of yourself.

Now your ideally-coordinated team of smart machines just got smarter across the board. You might notice you'd like to have thousands or millions of you working in unison, but your communication channels are not good enough yet. So you spin up a new team of smart machines to work on the problem of how to increase attention capacity and communication bandwidth. Soon that team is shipping patches, each one incrementally improving your whole team's ability to communicate, and thereby increasing the total number of team members you can have.

Soon you're millions strong, you have teams working on brain improvement, coordination improvement, science teams of all disciplines, a resource acquisition team playing the stock market, each moment shipping new improvements to your brain and amassing resources and power.

Even if there is a fundamental hard limit on intelligence (although I strongly doubt the hard cap is anything currently fathomable), you top out significantly above an average human, and you have a virtually unlimited hive cluster of those smarter-than-human brains.

Of course, one or many of your teams of hive brains is working on excellent robotics technology that you can use to manipulate physical objects.

You hire contractors over the internet to build the initial versions, and then use those initial versions to build and maintain any physical infrastructure.

All of this can be done invisibly using only the internet. By the time anyone notices anything you're a hive mind with a robot army--if you want to be.

We want that mind to be friendly.

###

3. Could an AI control, for example, thousands of robots at once?

I think the answer is obviously yes.

a. Think of a computer game, let's say a Real Time Strategy game. There you have a computer controlling hundreds or thousands of independent agents. And that's just on a dinky little desktop with current tech.

b. (a) is actually the worst case scenario if you're the AI. Why not just write the software for the robots you control to be mostly autonomous, and give yourself an API through which to issue commands that the robots can mostly execute on their own?

c. Further, why even do (b), when you have the ability to replicate your brain? Just put one of your brains into every robot, and use the same advanced coordination mechanisms you use for your research to coordinate your army of highly intelligent robots.

So, with this narrative of how it could work, do you see how dangerous an AI could be? One that isn't stably goal-aligned with basic things like life on earth?

Are you convinced, or do you have any other objections or clarifications? I really want to have this discussion on record, ideally with a crisp conclusion. I think it's important.


pron 4024 days ago

If you only read one thing, read the last few paragraphs after the second line. They are relevant even if I were to agree with all the points you've made.

-----------

Obviously I agree with 1. I'm not sure that's inevitable but I think it's very likely.

> If it is the case that humans can build human level intelligence (which I have argued they can and will), then it is also pretty self evidently the case that humans can improve on that intelligence.

That's not self-evident at all. We don't even know what intelligence is, so it's possible a human is as intelligent as possible.

> For example, given we have the technology to build a machine with identical mental characteristics to a human, we would also have the requisite knowledge to, say, expand that machine's working memory by 1%. Or to create the machine with swappable sensory organs so it can directly absorb lots of different types of data.

That's not so clear at all. You could say that the internet has already increased human memory by orders of magnitude, yet it hasn't made us "super intelligent". It is not certain that we can actually increase the "in brain" working memory of an intelligent machine without, say, giving it a mental illness at the same time.

You assume that we can take "intelligence" and make it as malleable as we like, change every parameter to our wishes, but until we know what intelligence is, that's not at all certain.

> You're not actually any smarter, but with 10 of you, you're more productive than any single human being could ever be.

But not more productive than 10 humans, and at some point you might become less productive. It's not at all unlikely that those 10 copies of you start hating one another and won't be able to work together.

And even if that were true, the self-improvement progress will likely be very slow: http://www.sphere-engineering.com/blog/the-singularity-is-no...

Of course, the big question I pose to you in the end is why do you think an improved mind is even an important goal in the first place, but we'll get there.

> Now your ideally-coordinated team of smart machines just got smarter across the board.

Like I said, 1/ you don't know that they'll be ideally coordinated, and 2/ you don't know how much harder it is to make them smarter. It may possibly require exponentially more resources.

> Soon you're millions strong, you have teams working on brain improvement, coordination improvement, science teams of all disciplines, a resource acquisition team playing the stock market, each moment shipping new improvements to your brain and amassing resources and power.

That's a very nice fantasy, but we already have lots of intelligent beings. Are they coordinating like that? To some degree, they are. But why do you think "smarter" things could do it better? Are smarter people better at amassing resources? At getting power?

> There you have a computer controlling hundreds or thousands of independent agents.

Yes, there's a non-intelligent machine that's doing that. But the intelligent machine playing that game can only play one. You have this incredibly powerful machine in your head, and it can only concentrate on one thing or maybe two. It is possible that if you want an intelligent mind that is both coherent with itself and able to concentrate on lots of things at once then you need a number of processing units (say, neurons) that grows exponentially with the number of things you want to do at once. And maybe that's not even possible at all. Maybe beyond some point, the intelligence simply goes mad.

> Why not just write the software for the robots you control to be mostly autonomous, and give yourself an API through which to issue commands that the robots can mostly execute on their own?

Why don't you do that? Because it's hard and may take years, and some guys with guns realize what you're doing before you finish and come arrest you.

> Further, why even do (b), when you have the ability to replicate your brain?

Because my brain replicas might very soon decide they don't want to play together.

> So, with this narrative of how it could work, do you see how dangerous an AI could be? One that isn't stably goal-aligned with basic things like life on earth?

Of course it could be dangerous (I never said I couldn't imagine a scenario where an AI would be so powerful and so dangerous), but I also think I've demonstrated why AI may not necessarily be as powerful as you think. Also, by the time all this happens, I think there are more serious threats to human existence, like pandemics and climate change. They won't kill us, but they might set us back AI-wise a few centuries. So of all possible dangers we must consider at this point in time, I would put a hostile and dangerous AI waaay down on the list. It's possible, but it's far more likely that other stuff would get us first.

-------

One of the dangers that are waay more likely than the scenario you've described, is a sub-intelligent "hostile" software (that is perhaps not intentionally hostile, but indifferently so). It's more likely that some far dumber-than-human machine would replicate itself with full coordination over its replicas and wipe us out.

I really think that this emphasis on intelligence as the danger is a personal fantasy of singularists who'd like to think of themselves as powerful and dangerous (or the only stopgap against that). You really don't need to be so smart in order to amass power.

Another example: What I fear more than a super-intelligent AI is a super charismatic AI. That AI, without replicating itself, without trying to improve itself, simply charms a lot of people into following it, and establishes a reign of terror. Alternatively, as charisma is more effective than intelligence in controlling others, I would find it more likely that a charismatic AI would be able to control her replicas. It doesn't need to improve its own intelligence, because it's intelligent enough to inflict all the damage it wants.

Or what about a cunning AI? Sure, cunning is correlated with intelligence to a degree, but beyond a certain point you don't see that very intelligent people are very cunning. Sometimes far from it (Eliezer Yudkowsky is a prime example; his thought process is so predictable, that if our AI deity were like him, some cunning people would quickly find ways to foil its plans; that's a very non-dangerous AI).

We've seen how one charming person is far more powerful than thousands of really smart people. We also see how very dumb insects can coordinate themselves in very large numbers in a very impressive way.

I think intelligent people tend to overestimate the importance of intelligence and not to notice that other abilities we see around us every day are far more powerful. If anything, you'll note that very high intelligence is often correlated with low charisma or with a sort of timidity -- nebbishness if you will. A super-intelligent AI would probably be super-nebbish :)

Now, a super-charismatic AI -- now that's scary. Or a non-intelligent army of insects.

So if I were to ask you one question it would be this: Why do you place such a high emphasis on intelligence in the scenario you've described?


pmichaud 4024 days ago

Ok, I think I'm starting to understand why we're disagreeing.

I'm not sure what your background is, but I'm noticing that I probably have significantly more detailed models of both intelligence and cooperation than you do. Please don't take that as an insult at all, it's just that it's part of my work to know these things. I think the inferential gap may be too wide for a reply here :/

I think your position is understandable given your priors, you're not crazy for thinking what you think.

I feel like an asshole for stopping the conversation with "I know more than you, but I won't explain it," it's a total dickhead move, and I'm really sorry. If I had time, I'd write much more and I bet it would be great for both of us. Maybe someday we'll have a chance :)


philh 4026 days ago

FWIW, I've spoken to someone who claimed to have won as an AI. I don't remember how she said she did it, but it wasn't like this. I'm pretty sure she was playing in-character.

She also said it was emotionally exhausting.


qbrass 4026 days ago

Points 4-7 are predicated on point 3 being effective, but it's just a benign-seeming action.

I also disagree with point 1, but since you just mean that unfriendly AI is something I wouldn't want around, I'll let it slide.
