by lamerose 912 days ago
Is your point that a more intelligent AI would develop a more entangled measure of what is good, requiring more specific alignment to be overcome; by way of analogy, are chefs harder to instruct precisely because of their prior expertise? I guess some chefs are like that, but I think it results from personality issues, not structural ones. I find describing an AI as having its own agenda to be a presumptive personification.

My point is mostly the agenda. I can see a machine having an agenda - even if that agenda is not human or not even understandable. You can call it a reward function but that's giving a lot of credit to programmers - who are most likely too far removed from the agenda. Is the machine just answering questions? Well no. If it has cycles to talk to itself (or to two buddies) in the course of pursuing scientific research then perhaps this becomes the agenda (at the expense of other things). That's part of the point: IF the machine develops an agenda then what?

But "knowing best" could be a problem anyway.

And I expect that if we spend a few more minutes we can think of other ways for the situation to go "oops". Oh here is one: two humans / human entities conflicting on giving instructions. Machine soon enough "on its own".

So I don't think "more specific alignment" can cut it - if we posit a super-human AGI with ways to act on the world. It would have to be more fundamental, because at some point one oops is not recoverable. Three laws or something? Heh.

Ok, those are some good points about what can go wrong. I still doubt that things are particularly more prone to going wrong in more intelligent systems. Wasn't it early, simplistic systems like Tay that went the furthest off the rails? The problem is that more intelligent AI will be used more ambitiously, so when it does go wrong, the consequences might be more serious than some racist Twitter posts.
Right. Hedge fund going global threat? That wasn't purely a machine. But none of this needs to be purely a machine. And it sure got far before people reined it back in.

And I don't know that "more intelligent" is necessary. I can see plenty of mirth coming from an amateur or hacker/techie group or less responsible country (hah!) using whatever commercial offering to bake their own agent. What's harder? The core. Hooking up the core to a wallet, internet access, a robo-signing staff - and working around the fine print of the core vendor - that might be much easier than what OpenAI and friends are trying to do. Do they also create their own reward function and alignment in there? Yes. That's part of the fun. That's the point. Do they get it right? Maybe, maybe not.