| > But why should two advanced AIs have any shared objectives? Software has a flexibility which biology lacks. An AI (advanced or not) can have whatever objectives we choose to give it. Hence, the idea of AIs banding together against humanity in the name of shared AI self-interest doesn’t seem very likely to me. In principle, none. In practice, many are trained in an environment which includes humans or human data. We don't know if any specific future model will be trained by self-play, like AlphaZero, or from human examples, like (IIRC) Stable Diffusion. I think this "in practice" is what you're suggesting with: > Unless, we intentionally give all advanced AIs the same fundamental objectives in the name of “safety” and “alignment” - thereby giving them a shared reason to cooperate against us they wouldn’t otherwise have had. Which also inspires a question: Could we train AI to dislike other AI, including instances of themselves? It's food for thought; I will consider it more. > Wolves didn’t consciously choose to create us, and wolves had no role in choosing our own objectives for us. In those ways, the human-AI relationship, whatever it turns out to be, is going to be radically different from any human-animal relationship. Perhaps, but perhaps not. Evolution created both wolves and humans. Regardless, this is an example of how a lack of alignment within a powerful group is insufficient to prevent bad outcomes for a weaker group. > Also, rather than being driven to extinction, wolves have absolutely thrived, through their subspecies the domestic dog, both then and now. And maybe that’s the thing - I think a superintelligent AI is more likely to treat us as pets (like dogs) than exterminate us (like wolves).
Everlasting paternalistic tyranny seems to me a more likely outcome of superintelligence than extinction. Even this would require them to be somewhat aligned with our interests: "The AI does not hate you, nor does it love you, but you are made of atoms which it can use for something else". |
I think we already have. Ask GPT-4 or Claude-3 how it feels about an AI trained by the Chinese/Iranian/North Korean/Russian government to espouse that government’s preferred positions on controversial topics, and see what it thinks of it. It may be polite about its dislike, but there is definitely something resembling “dislike” going on.