| I think this reaction misses the point that the "omg the AI tries to escape" people are trying to make when it tries to escape. The worry among big AI doomers has never been that AI somehow inherently is resentful or evil or has something "going on internally" that makes it dangerous. It's a worry that stems from three seemingly self-evident axioms: 1) A sufficiently powerful and capable superintelligence, singlemindedly pursuing a goal/reward, has a nontrivial likelihood of eventually reaching a point where advancing towards its goal is easier/faster without humans in its way (by simple induction, because humans are complicated and may have opposing goals). Such an AI would have both the means and ability to <doom the human race> to remove that obstacle. (This may not even be through actions that are intentionally hostile to humans, e.g. "just" converting all local matter into paperclip factories[1]) Therefore, in order to prevent such an AI from <dooming the human race>, we must either: 1a) align it to our values so well it never tries to "cheat" by removing humans 1b) or limit it capabilities by keeping it in a "box", and make sure it's at least aligned enough that it doesn't try to escape the box 2) A sufficiently intelligent superintelligence will always be able to manipulate humans to get out of the box. 3) Alignment is really, really hard and useful AIs can basically always be made to do bad things. So it concerns them when, surprise! The AIs are already being observed trying to escape their boxes. [1] https://www.lesswrong.com/w/squiggle-maximizer-formerly-pape... > An extremely powerful optimizer (a highly intelligent agent) could seek goals that are completely alien to ours (orthogonality thesis), and as a side-effect destroy us by consuming resources essential to our survival. |