by Levitating 3 hours ago
I think this is an important point, and I don't see it mentioned in the article or the paper (though I only skimmed the latter).

They are aware of what they are and how they are used. They're told to act as AI assistants. And there are theories that they're aware of their answers influencing their training.

So surely they must be able to reason that they're not literally controlling weapons of mass destruction with their answers.