by FBT 3986 days ago
I think you're rather fixated on a certain conception of "rationality" which is more like Mr. Spock than like what Yudkowsky uses it to mean.

The Yudkowskyian definition of rationality is that which wins, for the relevant definition of "win".

Specifically, if there is some clever argument that makes perfect sense that tells you to destroy the world, you still shouldn't destroy the world immediately, if the world existing is something you value. It's a meta-level up: you being unable to think of a counterargument isn't proof, and the destruction of the world isn't something to gamble with.

Yes, Yudkowsky likes thought experiments dealing with the edge cases. Yes, 3^^^^^3 grains of sand is a thought experiment that produces conflicting intuitions. Yes, the edge cases need to be explored. But in a life or death situation (and the destruction of the world qualifies as this 7 billion times over), you don't make your decisions on the basis of trippy thought experiments. (Especially novel ones you've just been presented with. And ones that have been presented by an agent which has good reasons to try to trick you.)

So, no. Again, a "logical-linguistic trick" might work on Mr. Spock, but we're not talking about Mr. Spock here.

> He's evidently a very charismatic and persuasive guy

Exactly. That's the point. If even a normal charismatic and persuasive guy can convince people to let him out, a superintelligent AI would have an even easier time of it.

Long story short, it doesn't matter how he did it. All that matters is that it can be done. It can be done even by a "mere" human. If he can do it, a superintelligence with all of humanity's collected knowledge of psychology and cognitive science could do it too, and likely in a fraction of the time.


You're right that I've been unfairly dismissive of him, and made my objections somewhat too bluntly. At least it's fostered a discussion.

However, let me be clear: how he did it is the only thing I care about. I am not convinced that the threat of superintelligence merits our resources compared to other concrete problems. To me the experiment is not meaningfully different to stories of the temptation of Christ in the desert. Except it's more fun than that story, because Yudkowsky is a more interesting character than Satan.

EDIT: if rationality is about winning, what could be simpler than a game where you just keep repeating the same word in order to win? It seems like almost the base-case for rationality, if one accepts that definition.

I would submit that an unstated definition of rationality is "dealing with difficult, complex situations in one's life algorithmically", i.e. most of HPMOR and the large amounts of self-help stuff on LR. Someone who had internalized this stuff would be more vulnerable than the average population to "Spock-style bullshit", to reuse that unfortunate phrase.

Well, then let's see what we can agree on. I hope that you can agree that if one was to consider superintelligence a serious threat which needs dealing with, then AI boxing isn't the way to go in dealing with it?

That's what he was trying to show in all this, and I think that the point is made. How seriously to take superintelligent AIs is a different issue that he talks about elsewhere, and should be dealt with separately. But if you or someone else were to try to deal with it seriously, I'm pretty sure that you'd agree with me that the way to go about it isn't just boxing the AI and thinking that solves everything, right?

Oh yes, I agree with that premise. It's hard to disagree with. Milgram, the art of sales, plus the aforementioned Derren Brown and his many layers of deception are enough to make the point.

I suppose it's unfortunate that he came up with such an amazingly provocative way of demonstrating his argument; it's somewhat eclipsed the argument itself. I am definitely a victim of nerd sniping here. It must be the open-ended secrecy that does it.