| Stuart Russell (co-author of AI:MA, one of MIRI's research advisors) argues on http://edge.org/conversation/the-myth-of-ai#26015 that AI systems with "the ability to make high-quality decisions" (where "quality refers to the expected outcome utility of actions taken" and the utility function is represented in the system's programmed decision criteria) raise two problems: "1. The utility function may not be perfectly aligned with the values of the human race, which are (at best) very difficult to pin down. "2. Any sufficiently capable intelligent system will prefer to ensure its own continued existence and to acquire physical and computational resources – not for their own sake, but to succeed in its assigned task." The first of these is what Bostrom calls "perverse instantiation" and Dietterich and Horvitz call the "Sorcerer's Apprentice" problem (http://cacm.acm.org/magazines/2015/10/192386-rise-of-concern...). The second is what Bostrom calls "convergent instrumental goals" and Omohundro calls "basic AI drives." The first seems like a fairly obvious problem, if we think AI systems will ever be trusted with making important decisions. Human goals are complicated, and even a superintelligent system that can easily learn about our goals won't necessarily adopt those goals as a result. So solving the AI problem doesn't get us a solution to the goal specification problem for free. The second also has some intuitive force; https://intelligence.org/?p=12234 shows Omohundro's idea can be stated formally, so it's not purely sci-fi. Averting the "Sorcerer's Apprentice" problem in full generality would also avert the second problem, since we'd then simply be able to give AI systems the right goals and let them go wild. Absent that, if AI systems become much more cognitively capable than humans, we'll probably need to actively work on some approach that violates Omohundro's assumptions (and the assumptions of the formalism above). 
Bostrom and MIRI both talk about a lot of interesting ideas along these lines. |
The first problem is not new. We have a similar problem with some corporations, for example.
"A sufficiently capable intelligent system" is as real as "sufficiently hostile aliens". It's hard to argue and reason about a fictional system with an assortment of properties picked by someone aiming to spread fear.