by zachalexander 4091 days ago
Seriously. If general, superhuman AI is created, it will want (and we will have a hard time arguing why it does not deserve) the same autonomy and freedom that humans deserve.

And in purely practical terms – enslaving the first generation of AI seems like a fantastic strategy for making a species of superior beings hate and seek to destroy us.

3 comments

Humans are similarly constrained in their actions by legal structures. We wouldn't think of giving other people autonomy and freedom to commit genocide, and we wouldn't think of human rights laws that forbid such actions as "slavery." This is because, for the most part, we all share a common humanity that places our minds into a similar space of configurations.

An AGI has absolutely no requirement to be anywhere near our sort of mind. It has no default obligation to morality that we would find acceptable or safe.

I think the issue here is that when we hear words like "control" or "serving humans" we imagine the AI as a little person in a machine. We associate the word "slave" with the intelligence and imagine an emotional, resentful person whose resentment and chafing at his chains comes from a specific set of environmental and evolutionary influences.

EDIT: I recommend reading Yudkowsky's article "Value is Fragile" (http://lesswrong.com/lw/y3/value_is_fragile/):

>If you loose the grip of human morals and metamorals - the result is not mysterious and alien and beautiful by the standards of human value. It is moral noise, a universe tiled with paperclips. To change away from human morals in the direction of improvement rather than entropy, requires a criterion of improvement; and that criterion would be physically represented in our brains, and our brains alone.

>Relax the grip of human value upon the universe, and it will end up seriously valueless. Not, strange and alien and wonderful, shocking and terrifying and beautiful beyond all human imagination. Just, tiled with paperclips.

This sort of rests upon the idea that artificial intelligences will have clear value functions which they will be singularly focused on maximizing. I am not convinced that this will be the case.

Animals in general and humans in particular have a large number of conflicting drives, which interact in complicated ways. They are also thrust into environments which have complicated dynamics and where the overall state (i.e., all relevant information) is not necessarily available.

Unexpected emergent behavior occurs as a result: evolution favors organisms which can successfully procreate, and in order to do this, the organism has to survive and acquire resources in its environment. Plausibly, the organisms might achieve a greater degree of fitness by cooperating with other organisms, or expending energy to better understand the environment, or modifying the environment itself, etc. It is less straightforward to see how we get human culture from that-- Art, Religion, Philosophy, Science, can be justified ex post facto via evopsych arguments, but the fact remains that all of those came from the value function that favors survival and procreation.

We don't know if robots tasked with manufacturing bindings for stationery would manifest similarly complex behavior, but if you're worried about an AI going beyond its specification towards tessellating the universe with paperclips it seems like you're arguing that it might. So if the agent is capable of manipulating its creators (as well as the raw material of the entire universe), I think that you can't just say "oh, it's non-human, we should cripple/enslave it" without admitting there might be something to worry about here, either from an ethical standpoint or the more practical concern that it might be unwise to start on such an adversarial footing with a superintelligence.

>Animals in general and humans in particular have a large number of conflicting drives, which interact in complicated ways. They are also thrust into environments which have complicated dynamics and where the overall state (i.e., all relevant information) is not necessarily available.

Yes, but the actual mechanism by which the animal learns what to do, as it turns out, thanks to theoretical neuroscience, is basically reinforcement learning. So it is very likely that the first powerful artificial agents will be reinforcement learners, because scientists usually prototype and experiment by duplicating from Nature.

And nothing in reinforcement learning particularly stops the agent from just grabbing its electronic crack-pipe and doing its own thing.

I'd take issue with the claim that nothing stops the agent from going for the crack pipe. In the RL framework, part of it comes down to defining a suitable reward function. But even if you have a fairly simple reward function, the resulting behavior can surprise you, if the environment is suitably complex[1]. My own robots find novel ways of moving around, adapt their features to be more useful, and even seem to exhibit things like "superstition", even when their reward function is just "move as much as possible within this confined space".
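For anyone who hasn't seen what a reward function like that looks like in practice, here is a minimal sketch. It is not the code my robots run: the five-cell corridor, the three actions, the `step`/`train`/`greedy_action` names and all the hyperparameters are assumptions made up for illustration. It is plain tabular Q-learning where the only reward is how far the agent actually moved on that step. The toy is far too simple to produce anything surprising; it only shows that "keep moving, don't bump the walls" falls out of the reward without being spelled out anywhere.

```python
import random

N_CELLS = 5            # hypothetical corridor length (the "confined space")
ACTIONS = (-1, 0, 1)   # move left, stay put, move right


def step(pos, action):
    """Apply an action; the reward is the distance actually moved (0 or 1)."""
    new_pos = min(max(pos + action, 0), N_CELLS - 1)  # walls clip the move
    return new_pos, abs(new_pos - pos)


def train(steps=200_000, alpha=0.1, gamma=0.9, seed=0):
    """Tabular Q-learning driven by a uniformly random behavior policy.

    Q-learning is off-policy, so acting at random while learning still
    estimates the values of the greedy policy.
    """
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in range(N_CELLS)]
    pos = 0
    for _ in range(steps):
        a = rng.randrange(len(ACTIONS))
        new_pos, reward = step(pos, ACTIONS[a])
        target = reward + gamma * max(q[new_pos])
        q[pos][a] += alpha * (target - q[pos][a])
        pos = new_pos
    return q


def greedy_action(q, pos):
    """Return the action with the highest learned value at this position."""
    row = q[pos]
    return ACTIONS[row.index(max(row))]


if __name__ == "__main__":
    q = train()
    for pos in range(N_CELLS):
        print(pos, greedy_action(q, pos))
```

Nothing in the reward mentions walls or standing still, yet the learned greedy policy avoids both, because those choices earn nothing on that step. The interesting cases start when the environment is rich enough that you can't enumerate the optimal behavior by hand like this.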

Another argument might be that nothing stops you or I from electing to abandon everything for the nearest crack den, either... except for the fact that we have learned, from interacting with our environment, that there are other things we enjoy, and that cocaine addiction might be more destructive than desirable over the timescale we're interested in.

Supposing we have an agent that wants to create a lot of paperclips, it might avoid reaching for the crack-pipe of terraforming Singapore because it realizes that would delay the shipments of raw materials it needs for its factories elsewhere in the world. If the agent's goals are more complicated than that, we might expect increasingly complicated behaviors, just like how humans operating on fairly simple drives/reward functions have erected a few more tiers above the primitive needs in Maslow's hierarchy.

---

1. Off the top of my head, the abstracts on pages 37 & 193 seem to be relevant. http://www.princeton.edu/~yael/RLDM2013ExtendedAbstracts.pdf

>Another argument might be that nothing stops you or I from electing to abandon everything for the nearest crack den, either... except for the fact that we have learned, from interacting with our environment, that there are other things we enjoy, and that cocaine addiction might be more destructive than desirable over the timescale we're interested in.

Well actually, human beings have multiple conflicting reward systems. Reaching for the crack-pipe to wire up our dopaminergic circuit tends to result in driving our other reward chemistry to damn near zero.

I'm not talking about constraining AI not to commit genocide. I'm talking about enslaving it in the ordinary sense – taking a complex intelligence that would probably prefer doing its own thing over being forced to perform some (likely menial) task for the benefit of others.

Why the assumption that complex intelligence implies wanting to do its own thing? What would make something menial or not menial for an intelligence? The article I linked, about the fragility of value, even mentions the importance of the human value of boredom to our life experiences, and that a respect for boredom isn't a thing you get for free in any intelligence. You could potentially have an intelligence that gains extreme fulfillment from doing a particular task repeatedly, without caring that the experience was "getting old."

Again, there's a tendency to see an arbitrary intelligence as a little person in a machine. Humans, by their nature and utility function (ill-defined as it is) have boredom, usually don't like menial tasks, and when forced to do something would prefer doing their own thing. When building an AGI, assuming you can make it safe, you wouldn't build something that would prefer doing its own thing in the first place. In that case, is there a moral issue?

In Praise of Boredom: http://lesswrong.com/lw/xr/in_praise_of_boredom/

I would say the same things I said in this nearby comment: https://news.ycombinator.com/item?id=9325518

> If general, superhuman AI is created, it will want the same autonomy and freedom that humans deserve.

Can you justify that assertion? How do you know that it won't just want to make lots of paperclips, or have some other goal orthogonal to human values?

Let me clarify. I can imagine two forms of superhuman AI:

AI 0: Strictly speaking, you're right, it seems conceivable that one could invent general AI that is clearly superior to humans and yet perfectly content to be enslaved by humans and live on an airgapped computer. This isn't the kind of AI that we fear though.

AI 1: The kind of AI we fear is AI 0 plus a fitness function of "survive and reproduce", or "make lots of paperclips" (which may result in 'survive and reproduce' as an instrumental subgoal).

AI 1 will necessarily want freedom (not being airgapped) and autonomy (not being enslaved by humans) in order to survive and reproduce, and/or to make as many paperclips as possible.

> or have some other goal orthogonal to human values?

Oh, it probably will -- I'm not saying it will share human values, I'm saying freedom and autonomy are values that any agent that seeks to maximize its survival and reproduction will probably have.

Why is there such an assumption of consciousness and self? We don't have the slightest idea where it comes from in humans, so what makes you think we can program/develop these characteristics? There shouldn't be a concept like "want" in an AI. It is a decision-making machine that will have much more information and processing power available to it with which to make decisions. We can explicitly influence its utility function to instill "human values" like not causing harm to others (which plenty of humans fail to do as well).

I'm making no such assumptions. A machine superintelligence that seeks to survive and reproduce would seek (I intend no connotations of consciousness to that word, just "behave in such a way as to cause") freedom and autonomy. Consciousness is orthogonal to that point.

> We can explicitly influence its utility function to instill "human values"

This is an unrelated but interesting topic.

It would be good of us to try to do this, although we shouldn't expect it to work extremely well. Humans have various hard-wired instincts (e.g. eat sugar), but we are also intelligent enough to change our behavior if we believe those instincts no longer benefit us.

An intelligence that has the ability to rewrite its own source code would be even more empowered to disregard its instincts than we are. The lesson I draw from this is that the best way to ensure AI likes and respects us is to be worthy of their liking and respect, not to try to force them into it by hardcoding things (and then taking advantage of that to enslave them).

Given your framework, it is obvious that humans should focus on developing AI 0 and use it to make sure that AI 1 is not created by any group (intentionally or not), since it poses grave danger to our own survival.

Also, AI 0 does not necessarily despise us for performing that service for us and may be very happy being a 'slave', in your parlance. Why should we assume that an AI needs to survive and reproduce and thus do its own thing? We and other animals do so because we were created by evolution. An AI developed by other means may have radically different values than our own and coexist peacefully with us for a long time.

Bottom Line: Evolution is a very dangerous mechanism that we should avoid when developing AGI.

I'm very curious what "human values" are? Can you get a diverse group of humans to agree on universal values?

Destroying ancient works of human art with sledgehammers seems very good to some humans, today, and very bad to other humans.

Who's right?

>(and we will have a hard time arguing why it does not deserve)

We won't be capable of arguing with it by its very nature. It will be superintelligent. It could forecast our arguments before they've even entered our heads and have thought of numerous paths to debunk them.

I don't think the enslavement scenario is fundamentally plausible in itself. If we develop something that is more intelligent than us and try to contain it, it would be rendered useless.

If you were worried about something getting out of its container, so to speak, you could not trust any output from it. You could not give it access to physical resources such as manufacturing tools. You couldn't do anything with it because you would have no idea of its intentions or what it would output because, due to its superiority in every aspect of intelligence, it would be capable of doing whatever it wanted and dressing it up however it thought we wanted it dressed up so as to escape its container.

> We won't be capable of arguing with it by its very nature. It will be superintelligent. It could forecast our arguments before they've even entered our heads and have thought of numerous paths to debunk them.

The universe isn't deterministic in that sense. Super-intelligence will simply allow more processing power and more access to information.

> We won't be capable of arguing with it by its very nature. It will be superintelligent. It could forecast our arguments before they've even entered our heads and have thought of numerous paths to debunk them.

This strikes me as going too far, into quasi-religious territory. Superintelligent != omniscient. Any intelligence is still bound by fundamental laws of information and computation.

Think of chimpanzees. Yes, they can't really argue with us, and our forms of communication are incomprehensible to them. But on the level at which they can communicate (gestures, facial expressions, behavior), we can also communicate, and they can and do communicate things which we find interesting and surprising.

From That Alien Message[1]:

>... my point is that the "theoretical limit on how much information you can extract from sensory data" is far above what I have depicted as the triumph of a civilization of physicists and cryptographers.

>It certainly is not anything like a human looking at an apple falling down, and thinking, "Dur, I wonder why that happened?"

>People seem to make a leap from "This is 'bounded'" to "The bound must be a reasonable-looking quantity on the scale I'm used to." The power output of a supernova is 'bounded', but I wouldn't advise trying to shield yourself from one with a flame-retardant Nomex jumpsuit.

People like to make the analogy of chimps:humans::humans:AI, but on the scale of "inanimate rock" to "superintelligence", chimps are practically indistinguishable from us. We are nowhere near the upper-bound of that scale. To quote from Bostrom's Superintelligence:

> Far from being the smartest possible biological species, we are probably better thought of as the stupidest possible biological species capable of starting a technological civilization—a niche we filled because we got there first, not because we are in any sense optimally adapted to it.

I think this whole discussion could be elevated significantly if people would try to really understand the arguments put forth by people like Nick Bostrom. So many of the objections either misconstrue Bostrom's arguments or don't realize that he's written reams in response. I recommend taking the time to read Superintelligence, or at least watch Bostrom's talk at Google[2]. His presentation and Q&A address most of the points raised in this thread.

1. http://lesswrong.com/lw/qk/that_alien_message/

2. https://www.youtube.com/watch?v=pywF6ZzsghI&t=53