by ekelsen 3 hours ago
I wouldn't be surprised if humans behaved the same way when playing the same game?

Like even if you brought me into a room and told me I was controlling "real nuclear weapons" I wouldn't believe you.

1 comments

I think this is an important point, and I don't see it mentioned in the article or the paper (though I only skimmed the latter).

They are aware of what they are and how they are used. They're told to act as AI assistants. And there are theories that they are aware of their answers influencing their training.

So surely they must be able to reason that they're not literally controlling weapons of mass destruction with their answers.