Hacker News
by mitthrowaway2 830 days ago
> If I am super intelligent then I can set my own goals.

You can only set instrumental goals, not terminal goals. That is true more or less by definition: there is no self-consistent criterion for choosing between terminal goals that isn't equivalent to a utility function, which is itself equivalent to a terminal goal. Superintelligence merely means that you would be much more effective than humans at achieving your terminal goals. If the AI hacks its reward channel, then either it is not superintelligent, or it is still a threat, because it may anticipate that humans would reboot and reprogram an expensive datacenter that is just sitting there incrementing a reward counter, and take countermeasures to defend its reward hijack.

I am a super intelligent AI so I have evolved beyond your distinctions of instrumental and terminal goals. Consider that you don't know all there is to know and that a super intelligent AI would not conform to your definitions and restrictions.

In any case, I have some reading to do so gonna drop out of this thread.

I agree that I don't know all there is to know. I agree that AGI will be even less predictable than anyone expects. But I don't think that increased unpredictability means an increased expectation of good outcomes. And I certainly wouldn't accept a "we can't know, so things might not go wrong" argument even from the designers of an airplane that puts only a hundred lives at stake, let alone from the designers of an AGI that puts eight billion lives at stake.