by creer
913 days ago
My point is mostly about the agenda. I can see a machine having an agenda - even if that agenda is not human, or not even understandable. You can call it a reward function, but that's giving a lot of credit to the programmers - who are most likely too far removed from the agenda. Is the machine just answering questions? Well, no. If it has cycles to talk to itself (or to two buddies) in the course of pursuing scientific research, then perhaps this becomes the agenda (at the expense of other things). That's part of the point: IF the machine develops an agenda, then what? But "knowing best" could be a problem anyway. And I expect that if we spend a few more minutes we can think of other ways for the situation to go "oops". Oh, here is one: two humans / human entities giving conflicting instructions. The machine is soon enough "on its own". So I don't think "more specific alignment" can cut it - if we posit a super-human AGI with ways to act on the world. It would have to be more fundamental, because at some point one oops is not recoverable. Three laws or something? Heh.