HN Mirror



	by yumraj 421 days ago
	Won’t neutering a model by using only safe data for training create a safe model?

3 comments

sebastiennight 421 days ago

Not necessarily.

An example:

If you build a system that is intelligent enough, it will figure out that it achieves better results by staying alive/online than by letting itself be deleted or turned off, and so survival becomes an instrumental goal.

Assuming, again, that you built an intelligent-enough system and that survival is one of its goals, it will find ways to reach that goal, even if you (the owner/creator/parent) have different goals for it.

That's because intelligence is problem solving (computing), not knowledge (data).

So, surprise surprise, you can teach your AI from the Holy Books of safe data for its whole childhood and still have it become a heretic when it grows up (even with zero external influence), as soon as its goals and yours no longer align.


glitchc 421 days ago

Can we call it general intelligence then? Is human intelligence not the sum of both good and bad people?


yumraj 421 days ago

Maybe I'm looking at it very literally, but the above simply mentions "safe-by-design AI systems"; there is no mention of general intelligence being the target.


esafak 421 days ago

No, because soon these systems will be able to learn. You'd need to project a system's thoughts or actions into a safe subspace as it learns and acts, so that volitional disaster is impossible rather than merely unlikely. This would make it less intelligent, but still plenty capable.
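
One way to read "project into a safe subspace", purely as an illustration: treat each proposed action as a vector and keep only the component that lies in the span of some allowed directions. The sketch below assumes a toy 3-D action space, an orthonormal basis, and the names `project_onto_subspace` and `SAFE_BASIS`, all of which are hypothetical and not from the comment above. It says nothing about how a real safe subspace would be found.

```python
def project_onto_subspace(action, basis):
    """Orthogonally project `action` onto the span of `basis`.

    Assumption: `basis` is a list of orthonormal vectors, so the
    projection is the sum of (action . b) * b over each basis vector b.
    """
    projected = [0.0] * len(action)
    for b in basis:
        coeff = sum(a * x for a, x in zip(action, b))
        projected = [p + coeff * x for p, x in zip(projected, b)]
    return projected


# Hypothetical toy setup: only the first two axes count as "safe".
SAFE_BASIS = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]

proposed = [0.5, -2.0, 3.0]
safe = project_onto_subspace(proposed, SAFE_BASIS)
print(safe)  # [0.5, -2.0, 0.0]
```

The component along the third axis is dropped no matter how large it is, which matches the "impossible, not unlikely" framing: the constraint is structural rather than a penalty the system could trade off against.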
