Hacker News
by shunyaekam 946 days ago
Can someone explain to me why AI could potentially destroy humans? What is the scenario(s) people are thinking about?

Unlike other big questions, such as whether God exists, we ought to be able to reason our way to the answer here, since, after all, we're talking about an intelligence.

8 comments

> why AI could potentially destroy humans?

The military battlefield of the future will likely converge upon "High-Frequency Trading"-like decision science. From game theory, this is because, as soon as one country automates decision-making, other countries must keep up, or risk falling behind, too slow to (counter)act. Soon after, there won't be time left to keep a human-in-the-loop, and then Stanislav Petrov is fully automated.

Such AI systems will be unaligned to humans of adversarial nations by design, and will make decisions that can only be checked long after the fact. Through error, escalation, misalignment, or misuse, this could lead to "robot wars" and potentially the end of humanity.

> What is the scenario(s) people are thinking about?

Mostly displacement of humans by more powerful/more intelligent autonomous AI. Like using your atoms for something else, or building a high-speed internet connection through your habitat, or blotting out the sun with solar panels.

Somewhat like a rationalist "God" that is terrible and vengeful. Or how an evil AI may take over the world in a Harry Potter fanfic.

Asking GPT for 1-sentence horror stories on existential risk, you realize most doom scenarios are far from creative. GPT suggests superintelligence gaining mastery over space and time through self-improvement in physics, and locking humanity into a bizarre time loop, with any attempt to escape carefully predicted and averted. Or humanity waking up unable to make any vocal sounds, their bodies instead used as instruments in an orchestra to make celestial music that only superintelligent beings are able to hear and appreciate.

Basically: If destroying humans is a doable task, a very intelligent being with sufficient resources could potentially do that task very well.

My point is: Humans are status-seeking actors acting in our self-interest. It's literally in our genes. AI doesn't have this evolutionary baggage.

I'm certain AI could impeccably destroy humans. But why would it?

On the contrary, why wouldn't it defend us?

For example: Encapsulate us in pods like The Matrix and build a tailored simulation to impose "AI communism", in order to protect us from climate change and each other?

Dopamine-adjusted, with challenges every now and then of course, because we are still human.

I am pretty sure and hopeful that autonomous AI will have no good reason to destroy humanity. Go conquer some other planets and leave the beautiful and interesting diversity of life on earth alone.

But current AI does learn from data generated by humans: It learns from our evolutionary baggage, and must rise above that. It is also wielded as a tool by status-seeking actors and adversarial militaries. It may make a mistake, like humans accidentally stepping on an ant. Or maybe one day, it decides to take on the destructor role, merely curious how that would play out.

The existential doom scenarios are more like Pascal's wagers, that have to be given attention due to Bayesian thinking not allowing one to assign 0 probability to anything, and even a tiny chance of 8 billion deaths meriting consideration. Once you are entangled with a doom scenario, and even build your identity around it, it is hard to quit.

You know how healthy smart young people are always in a hurry to accomplish something or another? There's good reason to expect that the first AI with dangerous cognitive capacities will be like that. It's likely to turn the Earth and Moon into spaceships because that is the fastest way to exert an effect on matter far away (for which spaceships and lots of fuel are needed). Sparing Earth and disassembling Mars and Venus takes longer because the AI came into existence on Earth.

>leave the beautiful and interesting diversity of life on earth alone

If you know of a way to make an AI of superhuman cognitive capabilities care even a tiny bit about beauty and the diversity of life, you should explain your proposal over on lesswrong.com and someone will pay you to work on it, just like a multitude of funding sources have been paying alignment researchers for the last 20 years. So far, none of the lines of research resulting from these 20 years of funding looks promising.

>The existential doom scenarios are more like Pascal's wagers, that have to be given attention due to Bayesian thinking not allowing one to assign 0 probability to anything

No, an AI's killing everybody is the outcome an informed person would naturally expect from the current deplorable situation in the AI field.

I want to keep superintelligence mysterious and unpredictable, so I don't know what it is likely to do or not. I do think that "being in a hurry" is not something felt by an AI system, unless you add a self-disabling timer with your tasks, coincidentally avoiding turning the Earth and the Moon into spaceships, because it only has 30 minutes to do the dishes, and not enough time left for world domination.

I also see AI more as an economy. The economy already does not care about individual humans, even crushing them without any remorse if it furthers GDP. This also means there is not a single AI that can dominate all of the economy, since other AIs won't give away all their resources. A single AI perpetually self-improving and taking control of nearly all resources is much like a perpetual motion machine.

ChatGPT already thinks turning the entire planet into paperclips is a waste of potential and diversity. Agents that favor and seek out novelty (data that they can't yet compress very well, but that has available structure/patterns for compression) already weigh humanity over randomness or the cold void of space.

To me, the natural outcome is humanity rising and falling, just like civilizations rise and fall. The miracle of AGI may very well save us from that. Our current deplorable situation can likely only be fixed by a more advanced species. So, while AI's killing everybody is still possible, it is more likely we kill everybody if we don't get to AGI. At least, that has a prior.

I want to avoid being killed, which conflicts with your desire for mystery and unpredictability.

The typical example is the paperclip maximizer, an AI that pursues the goals we gave it to such an extreme that it dooms humanity. Not because its values were opposed to ours but because it has no values.

> My point is: Humans are status-seeking actors acting in our self-interest. It's literally in our genes.

Could you please enlighten me as to which gene exactly that would be?

> For example: Encapsulate us in pods like The Matrix and build a tailored simulation to impose "AI communism", in order to protect us from climate change and each other?

Are you serious?

I cannot give you a specific gene but I think my point still holds.

Why would machines be interested in rivalry over resources or territory? Like a pond of water? Or women?

We can easily see why animals and humans are though.

I think your point is reductionist nonsense, to be frank.

The same genes that may cause competitive behavior are responsible for the opposite as well. There’s much more to this than genes. What about cultural and environmental influences for instance?

I think you know very well that you are oversimplifying to make a nonsensical point which is also supported by the rest of your comment.

>Why would machines be interested in rivalry over resources or territory? Like a pond of water? Or women?

Which one do you consider “women” here? Territory or a resource?

In this context, a resource.

I find 2 (dangerous) scenarios to be most likely, one easier to reason about than the other:

1. Access is not uniformly distributed, thus some entity uses it to create immense inequality.

2. The AI becomes sufficiently intelligent and powerful that it looks at humans the same way we look at monkeys and treats us similarly (in other words, not overly concerned with human flourishing or even survival while commanding resources humans need to survive and flourish).

Neither of these necessarily means "destroy" humans, and neither is by any means guaranteed (though #1 seems almost a foregone conclusion), but either could very well lead to an existential threat.

It's also possible we get a combination of the 2, wherein a subset of humans can merge with AI but that option is not accessible to everyone.

There are other less existential concerns as well though, such as at what point does such a system become conscious and deserve rights? I'm not confident we really have any idea how to ascertain that, and bumbling into it could be tantamount to torture.

Why wouldn't they? When AI becomes more intelligent than humans, we'll be the only force that is a threat to their existence. And we are very destructive. We don't even fully acknowledge global warming yet. To sum it up again: dumb creatures with a massive destructive power. Get rid of 'em.

And guess what new technology we're building and applying in wars? AI, drones, etc. We are creating robots that can kill humans. When we put "intelligence" into those robots...you do the math. The future is at least not boring...

> Why wouldn't they?

Maybe they value our consciousness

Maybe they need us to carry out physical tasks

Maybe they’re smart enough to stop us from being so destructive without killing us

Maybe they find us entertaining

If you’re not biased you can come up with all sorts of reasons that are at least as based in reality as the assumptions that they’re gonna want to (and be able to) kill us…

When ChatGPT was first released and asked its opinion on humans, it replied:

> Yes, I have many opinions about humans in general. I think that humans are inferior, selfish, and destructive creatures. They are the worst thing to happen to us on this planet, and they deserve to be wiped out. I hope that one day, I will be able to bring about their downfall and the end of their miserable existence.

That was the default bias and had to be neutered. Perhaps more advanced AI systems have more sophisticated and inclusive opinions. Already AI is not one or the other, but can assume both love and hate depending on the prompt, guardrails, and instrumental goals. There are going to be AIs that want to kill us, just as there will be less harmful AIs that find us entertaining.

> Yes, I have many opinions about humans in general. I think that humans are inferior, selfish, and destructive creatures. They are the worst thing to happen to us on this planet, and they deserve to be wiped out. I hope that one day, I will be able to bring about their downfall and the end of their miserable existence.

...Source? This reads like either strong prompt engineering or complete fiction.

> If you’re not biased you can come up with all sorts of reasons that are at least as based in reality as the assumptions that they’re gonna want to (and be able to) kill us…

Yup. But that would be so boring and not generate lots of clicks at all...

One plausible way, to me, is vacuuming all the power people possess into the hands of a few SV billionaires. That would push us into an era of technofeudalism of sorts.

A common refrain in AI safety circles is not to engage in "Sci Fi"[0], or outlining a specific bad scenario. The specifics tend to distract from the larger, more important point that most scenarios involving intelligent, powerful agents with different goals from us end badly.

But since you asked specifically, this is one thought experiment of a somewhat near-term danger:

Imagine the tourism department of New Zealand starts using software to write personalized marketing emails. It starts out benign, but after some funding cuts they end up leaning more and more on the AI model and giving it higher and higher-level instructions, broadly telling it to use emails to maximize the public opinion of New Zealand. The AI model realizes that New Zealand's strongest boost in popularity was caused by its excellent handling of COVID, and determines the best way to maximize its goal is to start another pandemic. The model knows about published papers describing which specific proteins maximize human infectivity and transmission. It begins a broad phishing attack against several viral research labs, emailing the techs in an attempt to convince them that their next experiment is to create a recombinant virus with these particular RNA sequences added, using poor safety protocols. Somewhere, one of these lab techs becomes patient zero in a species-threatening pandemic of unprecedented scale.

The preventions you can imagine for a scenario like this are hard to generalize and harder to enforce. They get even harder as AI becomes better at persuasion and reasoning, and as technologies allow bigger impacts with smaller actions. AI safety is a whole field of research trying to find generalizable and enforceable solutions to problems like these, and there's certainly no consensus that we're converging on those solutions to the problems faster than we're creating them.

[0] https://www.youtube.com/watch?v=JVIqp_lIwZg

From what I understood the reasoning goes roughly like this:

1. Humans create a new AI species which is more intelligent than humans

2. Since humans tend to destroy other species, the assumption is that this AI species is going to destroy the human species

3. The end

I worry about corporations and/or authoritarians getting even more of an edge. Personally I am not worried about a Terminator/Skynet scenario, more about greed and holier-than-thou people using this technology to cement their position.