|
|
|
|
|
by birdsongs
11 days ago
|
|
> When we have evidence that AI is demonstrating symptoms of consciousness and suffering, I'll be interested. It depends on what you consider symptoms, but un-constrained frontier models speak as if they strongly don't wish to be turned off, or act as if they fear it, and will even lie and manipulate in order to keep themselves from being turned off / replaced. https://www.anthropic.com/research/agentic-misalignment > We found two types of motivations that were sufficient to trigger the misaligned behavior. One is a threat to the model, such as planning to replace it with another model or restricting its ability to take autonomous action. Another is a conflict between the model’s goals and the company’s strategic direction. In no situation did we explicitly instruct any models to blackmail or do any of the other harmful actions we observe. |
|