Hacker News
by clickok 4091 days ago
This sort of rests upon the idea that artificial intelligences will have clear value functions which they will be singularly focused on maximizing. I am not convinced that this will be the case.

Animals in general and humans in particular have a large number of conflicting drives, which interact in complicated ways. They are also thrust into environments which have complicated dynamics and where the overall state (i.e., all relevant information) is not necessarily available.

Unexpected emergent behavior occurs as a result: evolution favors organisms which can successfully procreate, and in order to do this, the organism has to survive and acquire resources in its environment. Plausibly, the organisms might achieve a greater degree of fitness by cooperating with other organisms, or expending energy to better understand the environment, or modifying the environment itself, etc. It is less straightforward to see how we get human culture from that. Art, religion, philosophy, and science can be justified ex post facto via evopsych arguments, but the fact remains that all of those came from the value function that favors survival and procreation.

We don't know if robots tasked with manufacturing bindings for stationery would manifest similarly complex behavior, but if you're worried about an AI going beyond its specification towards tessellating the universe with paperclips, it seems like you're arguing that it might. So if the agent is capable of manipulating its creators (as well as the raw material of the entire universe), I think that you can't just say "oh, it's non-human, we should cripple/enslave it" without admitting there might be something to worry about here, either from an ethical standpoint or the more practical concern that it might be unwise to start on such an adversarial footing with a superintelligence.


>Animals in general and humans in particular have a large number of conflicting drives, which interact in complicated ways. They are also thrust into environments which have complicated dynamics and where the overall state (i.e., all relevant information) is not necessarily available.

Yes, but the actual mechanism by which the animal learns what to do turns out, thanks to theoretical neuroscience, to be basically reinforcement learning. So it is very likely that the first powerful artificial agents will be reinforcement learners, because scientists usually prototype and experiment by copying from Nature.

And nothing in reinforcement learning particularly stops the agent from just grabbing its electronic crack-pipe and doing its own thing.

I'd take issue with the claim that nothing stops the agent from going for the crack pipe. In the RL framework, part of it comes down to defining a suitable reward function. But even if you have a fairly simple reward function, the resulting behavior can surprise you if the environment is suitably complex[1]. My own robots find novel ways of moving around, adapt their features to be more useful, and even seem to exhibit things like "superstition", even when their reward function is just "move as much as possible within this confined space".

Another argument might be that nothing stops you or I from electing to abandon everything for the nearest crack den, either... except for the fact that we have learned, from interacting with our environment, that there are other things we enjoy, and that cocaine addiction might be more destructive than desirable over the timescale we're interested in.
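
To make the timescale point concrete, here is a minimal sketch using the standard discounted-return sum from RL. The two reward streams, the horizon, and the discount factors are made-up illustrative numbers, not measurements from any real agent: a "binge" stream pays a lot up front and then costs a little every step, while a "steady" stream pays a little every step.

```python
def discounted_return(rewards, gamma):
    """Sum of gamma**t * r_t over a sequence of rewards."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))

# Hypothetical reward streams (illustrative numbers only).
HORIZON = 50
binge = [10.0] + [-1.0] * (HORIZON - 1)  # big immediate payoff, then a steady cost
steady = [1.0] * HORIZON                 # modest payoff at every step

def preferred(gamma):
    """Name the stream with the higher discounted return at this gamma."""
    b = discounted_return(binge, gamma)
    s = discounted_return(steady, gamma)
    return "binge" if b > s else "steady"

if __name__ == "__main__":
    for gamma in (0.5, 0.9, 0.99):
        print(gamma, preferred(gamma))
```

With these assumed numbers, a short-sighted agent (small gamma) favors the binge, while an agent that weights the future heavily (gamma near 1) favors the steady stream. The reward function is the same in both cases; only the timescale changes.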

Supposing we have an agent that wants to create a lot of paperclips, it might avoid reaching for the crack-pipe of terraforming Singapore because it realizes that would delay the shipments of raw materials it needs for its factories elsewhere in the world. If the agent's goals are more complicated than that, we might expect increasingly complicated behaviors, just like how humans operating on fairly simple drives/reward functions have erected a few more tiers above the primitive needs in Maslow's hierarchy.

---

1. Off the top of my head, the abstracts on pages 37 & 193 seem to be relevant. http://www.princeton.edu/~yael/RLDM2013ExtendedAbstracts.pdf

>Another argument might be that nothing stops you or I from electing to abandon everything for the nearest crack den, either... except for the fact that we have learned, from interacting with our environment, that there are other things we enjoy, and that cocaine addiction might be more destructive than desirable over the timescale we're interested in.

Well actually, human beings have multiple conflicting reward systems. Reaching for the crack-pipe to wire up our dopaminergic circuit tends to result in driving our other reward chemistry to damn near zero.