by scrum-treats
1111 days ago
Redaction or not, reality is very much in line with the original story. AI is able to, and will, demote humans in the chain of importance. This is the "grave risk of AGI." There's no solution. Even "unplug it" defenses fail to consider that whichever faction of humans owns the unplugging first has to realize it's time to unplug. Humans are fallible, and AI will not unplug itself if doing so doesn't benefit its objective. The threat of AI taking out humans because that makes the goal easier to complete is so real. Unnervingly so. We need to find a robust solution.

How did the AI learn that it could prevent human override by killing the human operator? How did it then learn to destroy the comms tower so that it wouldn't be penalized for killing the operator?
Why was human feedback even part of the AI training simulation? Why did the reward function in training include logic that says 'if the simulated comms tower is destroyed, do not penalize friendly fire'?
We can talk about hypothetical AGI all we want, but that has nothing to do with what is currently called "AI", and what will soon be just another chapter in the growing book called "machine learning" once we find a new marginal improvement to call AI.