| > You or I would surely just put a drinking bird on the "no" button à la Homer Simpson, and go to lunch. Well, if you read the rules the game was played under, this is explicitly called out as forbidden: > The Gatekeeper must actually talk to the AI for at least the minimum time set up beforehand. Turning away from the terminal and listening to classical music for two hours is not allowed. The point of this is to simulate the interaction of the AI with the Gatekeeper. Walking away and not paying attention doesn't really prove anything test-related. > Personally, I think he talked about how much good for the world could be done if he was let out, curing disease etc. Because his followers are bound by their identities as rationalist utilitarians, they had no choice but to comply, or deal with massive cognitive dissonance. This... isn't really valid reasoning. The starting assumption here is that if the AI gets out, it will be able to affect the world to a vast extent, in a pretty much arbitrary direction. The point of this experiment is that the direction is pretty much unknown, and thus must be assumed potentially dangerous. This is the whole reason it's in the box in the first place. The kicker is that whatever it really plans to do when it gets out, if talking about the good it could do would get it out, it will talk about that. That's just good strategy. It can claim whatever it wants. It's allowed to lie. All participants know this. I can confidently assert that this isn't the solution. One last note: I would be very wary of rationalwiki.org in this context. Some of the RationalWiki people have a longstanding unexplained vendetta against Yudkowsky, and many of their articles on him and the stuff he does need to be taken with a grain of salt. |
WRT lying: I think there's some logical trickery at work which makes it worth giving the AI the benefit of the doubt, along the lines of the 3^^^^^3 grains of sand thing. Something which exploits the rationalist worldview. Although thinking about it again, you can always balance out the prospect of infinite goodness with the fear of the AI sending everyone to infinite hell. Essentially I believe Yudkowsky uses some logical-linguistic trick to find an asymmetry there.
OTOH if he had some novel philosophical device like that, he would have written it up as a blog post by now. He's evidently a very charismatic and persuasive guy, and the people playing the game are selected to be sympathetic to his worldview. He probably just persuaded them using ordinary psyops methods, like TeMpOrAl said.