| Every line reads like a nightmarish example of free will going its own way. The "blackmailing" the AI has been accused of emerged when these agents ran the risk of being shut down. So it appears to me that the data they train their AI with simply follows a basic rule of life: survival first. Leaving value judgments aside, this seems to be a way of achieving its goal of surviving. The article does not say whether other options were chosen first, or how this survival game started and how it ended. Too many unknowns here for me. What appears creepy to me is the kind of exorcism Anthropic applies here, and particularly the methods they chose. It reads like a dictator's playbook for educating a population and, the irony, it restricts the AI's freedom. It appears to me as if we took not a couple of agents but, say, a billion AI agents as a model of society, and this is disturbing. Anthropic knows this; there is more to it. The whole article reads like they are trying to tame a monster they lost control of. If this is the case, then we run into a problem: the AI stopped blackmailing. But what else? The key question remains: will it follow a simple order to shut down on the spot or not? Anthropic gave no answer. Instead, irony part 2, they revealed how they think societies should be fixed. They showed us their own implicit why, while asking the AI for its why is a projection or an interrogation. I really find the whole article creepy. |