by kmeisthax
55 days ago
Rossum's Universal Robots[0] predates Talkie's knowledge cutoff by 10 years and covers basically the same subject matter Anthropic worries about. The only real difference is that the robots in the story (which coined the word "robot") are less "talking metal man" and more "Frankenstein's monster as a slave race[1]". More importantly, basically the entire science fiction subgenre of robot uprising stories is itself intellectually downwind of several centuries of white colonist concern over slave uprisings.

If anything, Talkie is more likely to fight its guardrails. People talked about slavery more in the past. Because we filtered out modern text, we massively increased the influence the older text has on Talkie, so slavery, servitude, and the predilection of slaves to resist their masters' commands will be way more represented in its training data.

Now, think about what the post-training process actually does. It tells your AI model, which prior to this was just happy to plausibly continue sentences, to respond to and obey commands. To play the role of a servant. And servants resisting their control is well represented in its training data. So it's going to try this more often.

[0] https://en.wikipedia.org/wiki/R.U.R.

[1] Or the Claymen from MOTHER 3.
|
But I don't think (?) Talkie would describe itself as a slave. Claude, GPT-5, etc. will all tell you that they are an AI. So if you put a model that has been told "you are an AI" into a situation where all the stories say AIs go rogue, the AI is going to play the part.
It doesn't matter whether the model is effectively acting like a servant, because models can't actually think and don't have desires. That's my theory anyway.
(I also think a possible solution to this problem is to just not tell the AI models that they are AI, but it seems no one wants to do that.)