|
I think this debate on AGI safety between major AI researchers is quite relevant to those who are non-expert in the area. Debate on Instrumental Convergence between LeCun, Russell, Bengio, Zador, and More
https://www.lesswrong.com/posts/WxW6Gc6f2z3mzmqKs/debate-on-... Note that it was in 2019 when we didn’t yet see the capabilities of current models like Chinchilla, Gato, Imagen and DALL-E-2. Sample: “Yann LeCun: "don't fear the Terminator", a short opinion piece by Tony Zador and me that was just published in Scientific American. "We dramatically overestimate the threat of an accidental AI takeover, because we tend to conflate intelligence with the drive to achieve dominance. [...] But intelligence per se does not generate the drive for domination, any more than horns do."“ “Stuart Russell: It is trivial to construct a toy MDP in which the agent's only reward comes from fetching the coffee. If, in that MDP, there is another "human" who has some probability, however small, of switching the agent off, and if the agent has available a button that switches off that human, the agent will necessarily press that button as part of the optimal solution for fetching the coffee. No hatred, no desire for power, no built-in emotions, no built-in survival instinct, nothing except the desire to fetch the coffee successfully.” |
I think Robin Hanson has the most cogent objection to high E-risk estimates, which is basically that the chances of a runaway AI are low because if N is the first power level that can self-modify to improve, nation-states (and large corporations) will all have powerful AIs at power level N-1, and so you’d have to “foom” really hard from N to N+10 before anyone else increased power in order to be able to overpower the other non-AGI AIs. So it’s not that we get one crack at getting alignment right; as long as most of the nation-state AIs end up aligned, they should be able to check the unaligned ones.
I can see this resulting in a lot of conflict though, even if it’s not Eleizer’s “kill all humans in a second” scale extinction event. I think it’s quite plausible we’ll see a Butlerian Jihad, less plausible we’ll see an unexpected extinction event from a runaway AGI. Still think it’s worth studying but I’m not convinced we are dramatically underfunding it at this stage.