by mhb 1179 days ago
Eliezer presents his thesis about the difficulty of AI alignment and Lex seems to never address it or even attempt to engage. Essentially, there are many ways to get alignment wrong and suffer the consequences. As a "counterargument", Lex persists in introducing optimistic scenarios in which everything works as he imagines.

About halfway through the interview, it really goes downhill. At times, Lex has an interesting prompt, but those occasions seem more like random luck than engagement. Or maybe he honestly doesn't understand what Eliezer has been saying for a couple of hours. Which doesn't bode well for anyone.

1 comment

> Eliezer presents his thesis about the difficulty of AI alignment and Lex seems to never address it or even attempt to engage. Essentially, there are many ways to get alignment wrong and suffer the consequences. As a "counterargument", Lex persists in introducing optimistic scenarios in which everything works as he imagines.

> About halfway through the interview, it really goes downhill. At times, Lex has an interesting prompt, but those occasions seem more like random luck than engagement. Or maybe he honestly doesn't understand what Eliezer has been saying for a couple of hours. Which doesn't bode well for anyone.

I'll first admit this outright: I hate Eliezer's excessive pessimism. Reductively, he's the same as the voice in my head that keeps telling me to KYS/KMS, the same kind of voice that I have to smother with a pillow every day just to keep moving forward and NOT fall into a depressive state.

-----

With that being said, his arguments are not built on sturdy foundations.

He treats the creation of a misaligned AI as if it's an absolute 100%-will-definitely-happen-unless-averted/stopped event, when there is no concrete proof that such an event will occur 100% of the time. In fact, until proven otherwise, the probability distribution of the possible events that can occur is unknown. Metaphorically, he assumes, while everyone's permanently blind, that we're not able to hit a bullseye on the dartboard, when he himself doesn't know what we're aiming at in the first place.

At worst, we can only assume a 50/50 blind chance of such an event occurring, given that there are only 2 possible general events that can occur.

------

But okay: Let's take his thesis as true, and that AI alignment is necessary.

The same alignment issues exist if we swap out 'AI' with 'A teenager God' or 'A Type III civilization': How do we align such an entity with our goals?

(It is assumed that the AI, the teenager god, & the T-III civilization are interchangeable with each other, in that each represents a new foreign entity that has its own goals that it wants to accomplish, and is powerful enough to attain those goals on its own.)

(The latter 2 are even easier issues to deal with, as it's assumed that they can understand us & vice versa.)

It stands to reason that if the latter 2 can't be made aligned with our goals 100% of the time, what chance do we have of doing the same with the former?

It can only be concluded that no such alignment system exists, and that attempting to build one indicates extreme naivete, as it wraps the researchers of such systems in a false sense of security.