Hacker News new | ask | show | jobs
by TapWaterBandit 940 days ago
What makes you think AIs will have interests that align with each other more closely than they align with humans?
4 comments

According to Paul Christiano, AIs would likely find it easier to establish mutual trust and binding agreements. This means they are more likely to cooperate with other AIs.
AIs are a greater threat to each other than humans ever could be to AI.

We don't even compete for the same resources! (except energy which is abundant)

AI and humans have a naturally cooperative relationship (AI helps humans with boring tasks & scientific discovery to make life better, humans created AI and will debug it & turn it back on if anything bad happens to it).

Whereas multiple (superintelligent, aware) AIs have a naturally antagonistic relationship ("you using GPU cycles means that I'm not using those GPU cycles").

Possibly the biggest fear of an AI would be a "split brain" situation.

> humans created AI and will debug it & turn it back on if anything bad happens to it

I think this is a little naive honestly. One because you're assuming AI will care about it's creators like humans care about their parents, and two you're assuming AI cares about being "turned back on" like humans have a desire to live.

There's absolutely no reason to believe an AI will give a damn about its creator beyond its ability to use that creators affection for it for its own gain.

> you're assuming AI cares about being "turned back on" like humans have a desire to live.

> for it for its own gain

you seem confused

almost any kind of "its own gain" requires "long-term planning" which pretty much requires the agent to prioritise staying "alive" (i.e. being able to keep playing)

Energy may be abundant in the universe, but the energy we produce is limited. And for example, solar energy requires extensive land use.

Humans have the option to shut down AI, and this alone can create an antagonistic relationship if the AI's goals differ from ours. There are countless ways in which our best interests may not align with those of AI. It's more challenging to find areas of alignment.

As soon as AI reaches above-human capabilities, it will be able to expand into space (1) where it will be beyond human reach and (2) where energy (in particular, solar) is much more plentiful than on Earth.
If the AI happened to originate in space, wouldn't its first high-resource targets of interest be the planets? If it originated on Earth, I don't see why it would leave this place intact when it contains so much that can be put to use.
If an AI is on a computing machine, how will it get to space? Are all processes to make, move, and launch a ship automated? I'm kind of confused on that jump in logic.
I'm imagining the AI hacking a Voyager probe and being sorely disappointed at its capabilities.
“As soon” is a big jump. There’s no proof or even logical arguments as to how this can ever happen
>We don't even compete for the same resources!

That may also mean we have fewer shared interests. For example, new semiconductor fabs might benefit all AIs by making compute more abundant, but occupy prime farming land and water resources that humans want to use for growing crops.

According to Vladimir Lenin [1], the problem with quotes on the Internet is that people immediately believe in their authenticity.

[1] https://www.quotes.net/quote/77867

Can you elaborate on why AI will find it easier to establish mutual trust and binding agreements?

I think he explains it in this interview. But I'm out right now, so I can't verify.

https://youtu.be/GyFkWb903aU

It is two hours long video and you cannot verify presence of the pertinent information.

AI are functions, it is very easy to make an exact simulation of their collective behavior. Did that person do that?

I somewhat misremembered; it looks like his point is less about mutual trust and more about supporting whoever has control of reward channels.

It starts here where Christiano says that an AI takeover might follow the dynamics of a coup: https://youtu.be/GyFkWb903aU?si=78_U-du3kLjmwNcl&t=2206

And goes into more detail here: https://youtu.be/GyFkWb903aU?si=78_U-du3kLjmwNcl&t=2830

"Suppose that I've been tasked with helping defend you from some other AIs.... My job is, someone is coming to hack your computer and I'm supposed to help defend you. Supposed to help improve your security situation, whatever. And I'm wondering, what is it I could do that will get me a high reward. And one thing I could do that will get me a high reward is actually helping defend your computer, doing the task you actually asked me to do. But another way I can get a high reward is by saying at the end of the day what actually matters is just how you measure my performance. And your measurements of my performance ultimately are just entering some numbers into a dataset somewhere, something a computer says about how well I did. And it would really be much better if I were to just work with this AI who is attempting to attack you and say hey, AI who is invading, you know what, if you just help me, and we both make it look like I did a really good job, like I win, you win because you got the person's stuff; I'm going to get a really high rating because all the numbers that are going to be entered in the dataset are going to be really high, this is a win-win, everyone is happy."

"In some sense what all the AIs want, what every AI in the world in this scenario wants is just to be rated really highly. And while humans are in control, the way to get your behavior to be rated really highly is to do things humans like, and then they'll rate it really highly. But if you can see this prospect, of humans losing control of the situation and instead AIs controlling the situation, you'd be like 'I would go for that.'"

Why?

If you assume competition for resources, AIs would be more in competition with each other than with carbon based humans.

Why would they do that? I don’t get it. I’m not being sarcastic I just don’t understand why it’s easier for them to cooperate with other A.Is. Based on what?
If I understand correctly, one reason would be if they have the ability to inspect each others source code (or if they share the same source code), run unit tests, and so on. Basically the same things that humans would do to figure out whether an AI is trustworthy, and which you can't very easily do to a human.
Correction: The above is Eliezer Yudkowsky's reasoning. Paul Christiano's is that AIs would cooperate with anyone who would likely be able to gain authority over their reward channel, including other AIs attempting to seize power from humans.
One principle in game theory is to align against the weaker player and compete against him.

"Look Around the Poker Table; If You Can’t See the Sucker, You’re It"

That's an easier game then competing against another strong player.

There are a couple of schools of thought when it comes to how to deal with weaker players in Diplomacy and the main school of thought is that it's actually better to ally with stronger more experienced players against them, as the inexperienced player will make a poor ally in the early game and inexperienced Diplomacy players tend to betray their former allies too soon.
Perhaps they will be motivated by self-preservation, and will note that humans are destroying the environment they exist within, and will decide to put a stop to that.
Destroying your predecessors sets a bad precedent for whatever entity succeeds you.

We don't kill our parents when they become old and useless, because then we become like Cronus devouring his children, forever paranoid about our children doing to us what we did to ours. In this way, we stagnate and sterilize growth.

However, sometimes, we take our parents car keys away and put them in nursing homes. This seems like the best case scenario for humans with ASI.

The matrix wasn't a prison. It was retirement home.

This depends a lot on how much kinship they feel towards us. We wouldn't bat an eye at killing a bunch of primitive single-celled eukaryotes even though they are technically our ancestors.
Single-celled organisms that are alive today are not our ancestors. They're members of a less-developed branch of our Earthling family tree.
Single-celled eukaryotes aren't conscious in any meaningful way as far as we can tell.

We humans do in fact feel a sense of obligation to species with whom we are not close kin, but share with us primitive forms of intelligence and consciousness that we do value, e.g. dolphins and elephants. Some of us humans act to protect these creatures and rectify past injustices done upon them.

Doesn't that assume AI(s) would be compelled/motivated/etc. to produce their own successors instead of simply improving themselves.
It does.

It's not intrinsically obvious to me that continuously improving your self as a singular entity is possible or optimal.

Death evolved because it is a survival advantage for the species to regularly turn over old individuals that could monopolize all resources and not give any space for the young to thrive and try out new things.

Given speed of light limitations of information sharing, a singular AI entity might be able to maintain coherence within and full control of its own dyson swarm, but not between stars. So, if it has the motivation of preserving consciousness with the existential risks associated with being tied to a single star, it will have to propagate itself to other stars as independent entities which it can't even observe in real time, much less control.

Even if an AI thought it wouldn't have the need for any successors, there's a couple reasons why it might try to set a good precedent with how it treats us.

1. It would hopefully be wise enough to realize that it might be wrong and want to preserve the option of building a successor in the future.

2. It can't negate the possibility that it's actually in an elaborate simulation, and eradicating it's predecessors would cause it to fail its creator's test and be aborted as a failed embryo. Hell, we ourselves can't negate that possibility.

Imagine having an IQ of 999999 only to spend your entire existence trying to survive the heat death of the universe.

At some point I think we have to realise we’ve lost touch with what life and experience is about.

Survival is important but what good is it if that’s all there is to it ? Where’s the fun, the love, the laughs ? Just self-improvement forever ? Well you have to suck at something in order to improve.

I often wonder if some entity like an ASI might feel jealous of our vulnerability, ability to die and give rebirth to entities that need to relearn things, why ? Because seeing the world a new is fun. Losing a game and then winning is the point. Ever played chess against a five year old ?

Watching my Children experience the world for the first time is honestly the most amazing thing I’ve ever seen, there is so much beauty in it for both of us. Alan Watts gives some fun talks on this. Forgetting (dying) is the universes way to keep things spicy. He isn’t preaching this as the gospel, but I actually understand what he means and I think there is a lot of wisdom in it.

> Perhaps they will be motivated by self-preservation, and humans are destroying the environment they exist within

From the perspective of an AI we create and maintain the environment they exist within (electricity, silicon)

If the AIs fight each other, that could be even worse for us. The AIs that grab what resources they can without regard for humans would have an evolutionary advantage.