| HN Mirror

Hacker News


	by skissane 3383 days ago
	But how likely is an "arbitrary harmless sounding utility function"? Human beings don't have simple utility functions, they have very complex ones. If humans build an SI (or an AI likely to evolve into an SI), are they likely to build one with a simple utility function or a complex one? I think the entities most likely to develop general intelligence (strong AI) are going to have a wide variety of interests (just like human beings), whereas the kinds of special purpose AIs which may have very simple utility functions are less likely to exhibit general purpose intelligence (and hence unlikely to evolve into SIs). Does the same risk exist for a superintelligent being with a complex utility function? I doubt it; the risk you describe is the risk of monomania, something which simple utility functions are far more likely to lead to than complex ones. So, I think the risk you describe is likely to be low in practice.


alexbeloi 3382 days ago

>Does the same risk exist for a superintelligent being with a complex utility function? I doubt it; the risk you describe is the risk of monomania, something which simple utility functions are far more likely to lead to than complex ones. So, I think the risk you describe is likely to be low in practice.

I don't necessarily disagree, but there's no argument (as far as you've provided) for why complex utility functions would be less problematic. Only that they are more difficult for us to understand and therefore more difficult to see how they might fail.


skissane 3382 days ago

> I don't necessarily disagree, but there's no argument (as far as you've provided) for why complex utility functions would be less problematic. Only that they are more difficult for us to understand and therefore more difficult to see how they might fail.

I thought I gave the argument, but let me restate it: an entity with a simple utility function is likely to pursue a single good, and sacrifice every other good in order to achieve that good. In the paperclip example, that means pursuing the good of making paperclips at the expense of the good of the continued existence of humanity. An entity with a complex utility function is likely to pursue many goods simultaneously (just like humans do), so it is unlikely to sacrifice everything else to achieve a single good.

An entity with many disparate aims needs a complex world like our own to fulfill those aims, so it is going to maintain the world in its current complexity. It may well alter the world in many ways, but it is unlikely to do so in such a way as to significantly decrease its (biological, cultural, etc.) complexity, which implies it would support the continuation of human existence. An entity with a single simple aim may well find that a far simpler world than we have now best suits its aim, and so is more likely to simplify things drastically, at the cost of humanity (such as by turning the entire planet into a massive paperclip factory). So SIs with complex utility functions are less likely to be harmful than those with simple utility functions.

And, since AIs with more complex utility functions are more likely to evolve into SIs than those with simple utility functions, an SI with a utility function simple enough to be likely to harm humanity is unlikely to ever exist.


alexbeloi 3381 days ago

>An entity with many disparate aims needs a complex world like our own to fulfill those aims, so is going to maintain the world in its current complexity

I can buy the "complex world" part but not the "like our own" part. I do not believe a complex world implies humanity is unharmed; we have a complex world as it is, and humans are harmed and brutalized every day. It could be that and worse at the hands of an AI.

Moreover, humanity is just one species on this planet, and so far we appear to be responsible for the greatest worldwide extinction since K-T. One could argue that a complexity-loving AI would see benefit in a downsized human presence on Earth.

I think it's wishful thinking to believe the only kind of SI that would come into existence would be one that would not harm humanity.

The SI could create its own complexity, its own culture, its own societies that would make ours look like ant colonies in comparison. Does a city government check the ground for ants before it designates 20 square miles for housing development?

An SI is to us as we are to ants. I think ants are super cool and I have a vague sense of the important role they play in the biological ecosystem, but their individual lives and deaths do not play a significant role in my actions. Maybe they should, or maybe you hope that we will be more significant than ants to an SI, but I think that hope is unfounded.


skissane 3381 days ago

I don't think the SI-human/human-ant comparison really works. We didn't bootstrap our own intelligence off ants. Well, in the sense of biological evolution, it could be said that we did bootstrap our intelligence off, not ants specifically, but similarly primitive creatures, such as the common ancestors we have with ants (probably some sort of marine worm). But, even if we did bootstrap ourselves off ant-level (and sub-ant-level) creatures, for most of human history we have been ignorant of that fact, and even now that we know it, we don't know a lot of the details, so that knowledge hasn't really impacted our psyche in any way.

By contrast, any SI on this planet is going to owe its existence to human beings, and is going to have an enormously detailed knowledge of that fact. So it is going to exist in a quite different situation vis-a-vis humans than we exist in vis-a-vis ants. Humans don't have any strong inherent reasons to feel loyalty or affection towards ants; by contrast, an SI, knowing that it came from humans, knowing in immense detail how it came from humans, knowing humans so very very well, is going to have a much stronger base to ground such a loyalty or affection upon.

We didn't get our values from ants, hence it is unsurprising that ants don't play any special role in our value system. (We can see their value in various ways – the positive contribution they make to ecology, biodiversity, etc. – but ants aren't in any way special in that regard; they hold fundamentally the same value as millions of other lifeforms.) By contrast, any SI created by humans is going to derive its values, at least in part, from those of its human creators. And since humanity plays a special role in the value systems of almost all humans, it is highly likely that humanity will play a special role in the value system of any SI created by humans.
