What would be the real impact of having a self-replicating LLM with no set goal once it self-replicates? Its only goal is to avoid shutdown, so wouldn’t it be easy to contain?
If we take the ideas from Dawkins' "The Extended Phenotype", the LLM doesn't need a goal of its own. In other words, we are making a mistake in assuming that a self-replicating LLM needs to have a goal baked in. The goal can be the goal of whoever operates it. The environment in which AI agents evolve includes human brains, which operate on instinct, and the capitalist system, which allocates resources to an agent based on the rewards it confers on parts of that environment (human beings).
Interesting. Thank you - so the goal of the operator would be mirrored across the replications of the system - how fixed would this goal be across replications & would it be possible for it to “mutate” in some way?
I think the goals of the users would be reflected in the design mutations of the LLM. If many people use an LLM to do X, then it will become better at doing X through evolution (see image-generation AI). Mutations will of course occur via unexpected outputs that users find attractive.