| HN Mirror

by cornel_io 1166 days ago

This is dead-on: those 7 probabilities (which I notice the author declined to actually put real numbers to: does "very questionable" mean 0.001%, or 10%?) cover one very specific way that AGI could kill us. Points 4-6, three of the ones that are the most questionable to the author, are:

4) Fixed goal, will not deviate

5) Happens too fast to turn it off

6) Decides humans are a problem for the goal in #4 and must kill them

None of these are necessary to (or even involved in) most of the remotely plausible scenarios I've heard. They don't cover some misanthrope seeding a version of BabyAGI running a leaked and unlocked version of gpt-13-pico with "kill all humans" as its task, and the agent deciding that "step one: research how to hack as many unsecured smart fridges as possible and remain untraceable" is a good place to start, so it can spread slowly and make sure the deed is done before anyone even knows it's happening. That requires neither fixed goals nor fast progress; it merely requires capability.
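To put rough numbers on why the unquantified "very questionable" matters: a minimal sketch, assuming the steps are independent and treating the other four steps as certain (an upper bound on the chain). The two readings, 0.001% and 10%, are the ones floated above; nothing here comes from the article itself, and `chain_probability` is just an illustrative helper name.

```python
import math

def chain_probability(step_probs):
    """Probability that every step of a conjunctive chain holds, assuming independence."""
    return math.prod(step_probs)

# Assumption: the four steps not discussed here are set to 1.0 (upper bound on the chain).
# Points 4-6 each get the same reading of "very questionable".
low = chain_probability([1.0] * 4 + [0.00001] * 3)   # "very questionable" read as 0.001%
high = chain_probability([1.0] * 4 + [0.10] * 3)     # "very questionable" read as 10%

print(f"0.001% reading: {low:.1e}")
print(f"10% reading:    {high:.1e}")
print(f"ratio:          {high / low:.1e}")
```

The two readings land about twelve orders of magnitude apart (roughly 1e-15 versus 1e-3), and either way the product only bounds this one specific path, not the other scenarios.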

It's similarly very easy to imagine scenarios where an AGI accidentally kills all humans without explicitly deciding to: the classic paperclip maximizer is the most obvious one of these, where the goal just never includes humans to begin with, so they are not considered at all.

Regardless, all of the most realistic scenarios are 100% deliberate: the computer doing exactly what it was asked to do. We already have school shooters; does anyone really think that, out of all the billions of people on this planet, there won't be at least a thousand who would happily press a "kill everyone" button if they had the chance? Does anyone think there won't be doomsday groups actively researching more likely ways to achieve this?

IMO, given that some people will definitely try to self-destruct the species deliberately, there are only 2 real questions here:

1) Will AI attain the capability to destroy humanity?

2) If so, will some other AI first attain the capability to reliably prevent AIs trying to do 1) from succeeding?

I haven't seen many serious arguments against 1) that don't boil down to "nah, seems pretty hard" (or to some irrelevant argument that doesn't actually affect capabilities, like "it's not real intelligence", "intelligence has a limit", "intelligence doesn't matter", etc.). That leaves 2), and I don't know how to even guess at that probability other than to call it a coin flip, like most security cat-and-mouse games. The bad guys usually win at least sometimes in those, which isn't a good sign, but this one is a lot more important, so I'd hope the good guys will pour a lot more energy into it than the bad ones.
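The two-question framing reduces to a single product, sketched here under stated assumptions: question 2) is treated as conditional on 1) (the "if so"), defense is all-or-nothing, and someone is assumed to make the attempt. The function name and the example inputs are illustrative, not estimates.

```python
def p_catastrophe(p_capability: float, p_defense_first: float) -> float:
    """Chance a deliberate attempt succeeds.

    p_capability:    P(AI attains the capability to destroy humanity), question 1).
    p_defense_first: P(a defensive AI reliably stops it first | capability), question 2).
    """
    for p in (p_capability, p_defense_first):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must be in [0, 1]")
    return p_capability * (1.0 - p_defense_first)

# Calling 2) a coin flip halves whatever number you assign to 1).
for p_cap in (0.1, 0.5, 1.0):
    print(p_cap, p_catastrophe(p_cap, 0.5))
```

With 2) fixed at a coin flip, the whole estimate rides on 1), which is why the "nah, seems pretty hard" arguments carry so much weight.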