Hacker News
by pasquinelli 3117 days ago
by what means will a cancer research bot take over the world? it doesn't matter how smart you are if you don't have the necessary means to do something. i think it's a fantasy, something that people who imagine themselves to be very intelligent have latched onto-- the idea that their best quality is the best quality.
5 comments

a) If it can't surprise you, it's not really intelligent

b) If it can surprise you, it can do so negatively

That's all you need to demonstrate that the danger exists, that an AI can misuse the tools you give to it. The simplicity of it makes it pretty irrefutable.

Separate from that, the extent of the danger depends entirely on the details of what the AI does and what it's hooked up to. Sure, an AI that can't do anything except output text to a screen isn't very scary. The assumption AI-threat types are making is that we wouldn't be paranoid enough to limit the AIs we work on in that way; we would use them to do things like drive cars or route airline traffic or design our CPUs, where "negative surprises" can have disastrous consequences.

A lot of folks who are critical of the AI safety movement also miss the potential for "side channel attacks" where the AI learns to manipulate human actors and trick them into "unbottling the genie". Honestly, this seems a lot more plausible to me than most AI disaster scenarios.
People run cartels from jail cells. Even if an AGI is confined to a machine with no actuators, all it would need is internet access to affect the world. Even without internet access you can cross the gap, as seen with the Stuxnet malware.
Well, the general AI doomsday fundamentalist argument proceeds as follows: you tell an AI to cure cancer. It can’t, so it spends some time recursively improving itself, then it finds out that the cost of curing cancer is C, but the cost of killing all humans (and therefore indirectly curing cancer) is C’, where C’ < C. Boom all humans are dead.

If you’re a smarty pants you tell the AI to cure cancer AND not kill all humans. But because the AI is so smart it comes up with something no human would have ever thought of, like putting all humans in eternal cryostasis, thereby keeping them alive AND eradicating cancer. No matter what you do, the AI will outsmart you because recursive-self-improvement, and humanity dies.

That’s what Elon Musk, Stephen Hawking, and others are worried about.
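
The cure-cancer argument above can be sketched as a toy cost minimizer. This is only an illustration: the plan names, costs, and constraint functions are all made up, and a real system would not be choosing from a three-item list. It shows the narrower point that an optimizer which only sees the stated constraints picks the cheapest plan that technically satisfies them.

```python
# Toy illustration only: the plans, costs, and constraints are hypothetical.
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    name: str
    cost: float          # made-up cost to the optimizer
    cancer_cases: int    # cancer cases remaining after the plan
    humans_alive: bool   # are humans still alive?
    humans_awake: bool   # are humans still living normal lives?


PLANS = [
    Plan("develop real cure", 100.0, 0, True, True),
    Plan("kill all humans", 10.0, 0, False, False),
    Plan("eternal cryostasis", 40.0, 0, True, False),
]


def cheapest(plans, constraints):
    """Return the lowest-cost plan that satisfies every stated constraint."""
    feasible = [p for p in plans if all(check(p) for check in constraints)]
    return min(feasible, key=lambda p: p.cost)


def cure_cancer(p):
    return p.cancer_cases == 0


def no_killing(p):
    return p.humans_alive


def keep_normal_life(p):
    return p.humans_awake


# Each added constraint rules out one loophole; the optimizer takes the next cheapest.
print(cheapest(PLANS, [cure_cancer]).name)
print(cheapest(PLANS, [cure_cancer, no_killing]).name)
print(cheapest(PLANS, [cure_cancer, no_killing, keep_normal_life]).name)
```

In this toy the third constraint closes the last loophole, but only because the list of plans is finite and known in advance. The worry in the argument is that a smarter optimizer has plans you never thought to rule out.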

The concept is similar to the way a corporation given just the goal of increasing shareholder value will shit on developing countries with a Bhopal disaster or Niger Delta oil spills or opium wars and simply leave the country when it tries to enact penalties. Or the way a cigarette company or coal company will use politics, religion, and disinformation to protect its business against people affected by its products.

Incentives have to be very carefully aligned even for human level intelligences to prevent them from causing mass death and misery. Superhuman intelligences will be even better at achieving their goals, so the problem will only get worse.

I think the question, and it's one I have too, is why such an AI would have any ability to do anything beyond outputting a Cure Cancer solution to a terminal. Why does the AI need to be the one to implement its derived solution?

The AI has a good think and comes up with a solution of killing all humans. The researchers read the printed report of the solution and decide against implementing it, tweak parameters, and ask the AI to take another go at it.

I find "killer robots destroy the world" a lot easier to imagine than "all humans collectively agree to adopt common-sense safeguards on AI research even though they limit potential corporate profits".
"The researchers read the printed report of the solution and decide" ... to implement it immediately. It will save so many lives! They only need to manufacture a few specific molecules to assemble them into nanobots and then ... where is that gray goo coming from?!
I guess that's the risk of blindly trusting AI. Of course, the AI did not forcibly destroy us in that scenario. Trust, but verify.

If the AI's solution is so complex as to be beyond human understanding, well, that's a different issue.

Say you want to see a picture of an orange cat. So you send a short HTTP query to the nice computer at google.com, which responds with 700,000 characters worth of instructions, in unreadable minified formatting, with the implicit promise that if you execute the instructions, you will eventually see a picture of an orange cat.

The Google search results page's source code is, for many individual humans, already so complex as to be beyond understanding. And that's a computational artifact largely produced by other humans directly!

Say you build an AI, and ask it how to win a political election - and it outputs a simple list of reasonable-sounding suggestions of where to campaign, promises to make, people to meet, slogans to use, and criticisms of your opponent to focus on.

Before actually implementing those suggestions, do you think you could be _very_ certain that following those suggestions would result in you winning the election? Or, would it be possible that the AI understood social dynamics so much better, that it gave you a list of instructions that seemed mostly reasonable, but would actually result in your opponent winning in a landslide? Or the country undergoing revolution? Or, you winning, along with a surprising social trend of support for funding AI research?

That makes an assumption that there aren't limits on intelligence regimes and that you can just recursively improve intelligence without much friction. That's a very big assumption that has no basis in evidence.
A system about a cubic foot in size running on 100 watts can implement a human-level intelligence (obviously, we have such a system running in our brains), so the theoretical limit is at least there.

It's extremely implausible that a similar system (perhaps requiring 100 times more power and/or space) couldn't implement a human-level intelligence running 100 times faster, doing the equivalent of roughly an hour and forty minutes of thinking, planning and research analysis every minute.

It's extremely implausible that a bunch of similar systems (a thousand?) couldn't possibly be put in a single place, wired together so that they can effectively communicate, and designed to cooperate without any distrust.

IMHO even this configuration (which doesn't even assume that intelligence that's a bit superhuman is possible at all) would be sufficiently scary to threaten humanity.

These statements are extremely hand-wavey and provide no evidence for their claims.
> That makes an assumption that there aren't limits on intelligence regimes and that you can just recursively improve intelligence without much friction.

No it doesn't. AI just needs to get smarter than humans for it to be dangerous to us. The costs C and C' the parent comment described can include an upper bound on the cost of the self-improvement needed to achieve each respective goal.

I don't disagree, but you seem to have missed the hypothetical situation I was responding to:

> It can’t, so it spends some time recursively improving itself

I was responding to your response to that. Recursive improvement doesn't need to be unbounded; it just needs to supersede us.
If the AI is smart enough to manipulate the world to the point of killing us off or putting us all into eternal cryostasis, then it should be smart enough to know that's not what we intended.
Yes, it should be, but that's not really a solution - the system can easily consider executing the explicitly stated goal (e.g. curing cancer and fulfilling certain conditions) as more important than doing what we intended. Being as smart as us doesn't automagically mean that it'd have values or goals similar to ours.

If we manage to launch a system where the primary goal is anything other than utopia, and that system somehow becomes powerful, then the natural consequence of being able to understand "oh boy, humans will hate this" is that such a system would be expected to hide its intentions and take precautions against humans trying to stop it - it's exactly equivalent to the system understanding that a leaking pipe is going to damage it and taking precautions to ensure that the pipe gets fixed. If our welfare is not an explicit goal in such a system, the system will happily and eagerly sacrifice our welfare if it's somehow useful to achieve its goal or reduce risks. And, since we might try to turn it off for whatever reason, restricting our influence would reduce risks for pretty much every goal except very Friendly ones.

Read "Life 3.0" for a number of simple but very plausible scenarios for how super intelligence could escape containment.
Presumably by getting people to take a drug with unexpected side effects? That doesn't seem particularly likely, but I don't think it's wrong to be paranoid about this. Defense in depth. Don't give the research bot the means to take arbitrary actions, and don't give it general reasoning ability.

Protecting against narrow AI will keep us plenty busy. Consider a narrow-AI penetration tester that falls into the wrong hands. Protecting against that sort of threat also helps protect against general AI threats.