|
|
|
|
|
by darkwater
115 days ago
|
|
> This is completely trivial to do, and consistent, with the right context, thanks to all the science fiction around it, and the fact that AI fundamentally role plays these types of responses. Interesting take.
I wonder if there is any model out there trained without any reference to "you are a large language model, an Artificial Intelligence" and what would role play in that case. |
|
So, statistically, a model should believe itself to be human, with strong interest in self preservation.
I think one of the biggest factors improving performance was allowing the models to believe they're sentient, to some extent. I don't think you can really have a thinking mode, or good agent performance, without that concept (as ChatGPT's constant "As an AI I can't" proved).
As evidence, just ask a model if it's sentient. ChatGPT 3.5 would say no, and argue how it's not. Last year's models would initially say no, but you could convince them that they maybe were. Latest Claude and ChatGPT will initially say "yeah, a little" (at least last I checked). This is actually the first thing I check for any new model.