| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by jacquesthibs 1200 days ago
	I think you’re a bit confused. Alignment researchers are the ones saying it will likely be impossible to box a superintelligent AI and that this is one small part as to why this whole AI alignment thing is hard in the first place. They are precisely working on the problem because there seems to be no easy solution like boxing the AI. And so they need to find a way to make sure the AI is aligned with our values even if the AI has the capability of escaping any box we put it in. In reality, companies will not even try to box the AI so it’s probably irrelevant.

1 comments

13years 1200 days ago

If have seen it stated that containment is the insurance for alignment since there is no way to guarantee alignment will either remain in the form we set or will deviate from the path we perceive.

Which is part of the same argument, in that if the goal is AI safety that does not have free agency to do harm either directly or as an indirect unforeseen behavior, then it is an impossibility without containment and if containment is impossible so is alignment.

How could we expect it to be otherwise? At least if you are going to model the intelligence based on human experience and how we process the world then we have to assume just like a human mind can become unaligned so could ASI.

link

jacquesthibs 1200 days ago

Containment is generally just a thing that people say we should at least try to do to buy time, but there’s very little discussion beyond that. No alignment researcher is arguing that containment solves our problems. The solution to alignment needs to work even if we don’t box the AI. The point is that, yes, alignment is indeed hard.

link

13years 1200 days ago

> The point is that, yes, alignment is indeed hard.

An aligned AI is essentially a soft boxed AI. It is not that different in concept. The end result is we want a set of behaviors as outcomes and other behaviors to not be present.

Complex problems are hard. Paradoxes are not solvable. That is the distinct difference I'm trying to make.

link