by new_guy 2660 days ago
Simply by doing what you tell it, but not in the way you expect.

It's a little contrived, but say you tell it to 'solve world hunger', so it 'does a Thanos' and wipes out half the human population by releasing a pathogen or something. It has fulfilled its primary function, but (hopefully) not in the way you expected.
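To make that concrete, here's a toy sketch of the same failure. Everything in it is made up for illustration (the `World` class, the two actions, the cost numbers); it assumes the only thing the optimizer is told to care about is the count of hungry people, with cost as a tie-breaker.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class World:
    population: int
    food_units: int  # assumption: one unit feeds one person


def hungry(world: World) -> int:
    """The literal objective: how many people have no food."""
    return max(0, world.population - world.food_units)


# Hypothetical actions: name -> (effect on the world, cost to the AI).
ACTIONS = {
    "grow_more_food": (lambda w: World(w.population, w.population), 100),
    "halve_population": (lambda w: World(w.population // 2, w.food_units), 1),
}


def pick_action(world: World) -> str:
    """Pick the action that minimizes hunger, breaking ties by lowest cost."""
    def score(name: str):
        effect, cost = ACTIONS[name]
        return (hungry(effect(world)), cost)
    return min(ACTIONS, key=score)


world = World(population=8, food_units=4)
choice = pick_action(world)
print(choice, hungry(ACTIONS[choice][0](world)))
```

Both actions drive the hunger count to zero, so the objective as written can't tell them apart, and the cheaper one wins. Nothing in the goal said the population had to stay where it was.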


If you were to define a clear goal which the AI strives for, wouldn't it be possible to define other goals along with it such as "Never ever hurt humans"?
> wouldn't it be possible to define other goals along with it such as "Never ever hurt humans"?

I'm not even close to being an AI alarmist, and I'm skeptical of a lot of Nick Bostrom's arguments. But he does do a pretty good job of articulating the problem with this scenario in his book Superintelligence. He makes a good case that it would be very difficult to articulate such values for the AI. If you're interested in this topic in the general sense, I'd suggest reading the book. I don't think it's perfect, but I will acknowledge that he makes some good points.

That's not a clear goal. For example, define "hurt". People define hurt in all kinds of ways, and sometimes differently when it's themselves or someone else.

Then there's the problem that humans hurt other humans. Should the AI stop that? It's going to have to hurt humans to do it. But if it doesn't, that will hurt other humans...
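A rough sketch of that bind, assuming "never hurt humans" is enforced as a hard filter over the AI's options. The scenario, the action names, and the `permitted` helper are all hypothetical.

```python
# Hypothetical scenario: one human is attacking another.
# Each available action maps to the set of humans who end up hurt.
OUTCOMES = {
    "intervene": {"attacker"},   # stopping the attack hurts the attacker
    "do_nothing": {"victim"},    # standing by lets the victim get hurt
}


def permitted(outcomes: dict) -> list:
    """Keep only the actions that hurt nobody (the 'never hurt humans' rule)."""
    return [action for action, harmed in outcomes.items() if not harmed]


print(permitted(OUTCOMES))
```

The filter leaves nothing to choose from, so the rule as stated gives no guidance at all in exactly the situations where you'd want it.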

Yeah, if it were easy to codify an objectively correct, non-contradictory, and universally applicable moral framework, we would have done so in the last eight or so thousand years since we started caring about such things. There are reasons human law is incredibly complex and purposely vague.

Isaac Asimov made a career out of pointing out the hubris of it. Three simple laws, what can go wrong?