by the8472
715 days ago
The argument is that "humans live, but suffer" is a smaller outcome domain and thus less likely to be hit than an outcome incompatible with human life.
Because once you've gotten something to care about humans at all, you've already succeeded at 99% of the alignment task and only failed at the last 1% of making it care in a way we'd prefer. If it were obvious that rough alignment is easy but the last few bits of precision or accuracy are hard, that'd be different. I fail to see a broad set of paths that end up with totally unaligned AGIs and yet humans live, but in a miserable state. Of course we can always imagine some "movie plot" scenarios that happen to get some low-probability outcome by mere chance. But that's focusing one's worry on winning an anti-lottery rather than allocating resources to the more common failure modes.
|
Who is we? Humanity does not think with one unified head. I'm talking about a scenario where someone makes an AI that serves their goals, but in doing so harms others.
AGI won't just happen on its own. Someone builds it. That someone has some goals in mind (they want to be rich, they want to protect themselves from their enemies, whatever). They will fiddle with it until they think the AGI shares those goals. If they think they didn't manage to do it, they will strangle the AGI in its cradle and retry. This can go terribly wrong and kill us all (x-risk). Or it can succeed, in the sense that the people making the AGI aligned it with their goals. The jump you are making is to assume that if the people making the AGI aligned it with their goals, then that AGI will also be aligned with all of humanity's goals. I don't see why that would be the case.
You are saying that doing one is 99% of the work and the rest is 1%. Why do you think so?
> Of course we can always imagine some "movie plot" scenarios that happen to get some low-probability outcome by mere chance.
Definitions are not based on probabilities. sanxiyn wrote "AI is safe if it does not cause extinction of humanity." To show my disagreement I described a scenario where the condition is true (that is, the AI does not cause extinction of humanity), but which I would not describe as "safe AI". I do not have to show that this scenario is likely in order to show the issue with the statement, merely that it is possible.
> focusing one's worry on winning an anti-lottery rather than allocating resources to the more common failure modes.
You state that one is more common without arguing why. Stuff which "plainly doesn't work and is harmful for everybody" is discontinued. Stuff which "kinda works and makes the owners/creators happy but has side effects on others" is the norm, not the exception.
Just think of the currently existing superintelligences: corporations. They make their owners fabulously rich and well protected, while they corrupt and endanger the society around them in various ways. Just look at all the wealth oil companies accumulated for a few while unintentionally geo-engineering the planet and systematically suppressing knowledge about climate change. That's not a movie plot. That's the reality you live in. Why do you think AGI will be different?