The really nice thing about this is that the AI can now acquire these newly-decoded texts as part of its training set, and begin learning at a geometric rate.
With our current methods, feeding even fairly small amounts of a model's own output back in as training data leads to declining performance.
Just think of it abstractly. The AI will be trained on the errors the previous generation made. As long as it keeps making new errors each generation, they will tend to multiply.
Degradation of autoregressive models fed their own unfiltered output is pretty obvious: it's basically noise being injected into the ground-truth probability distribution.
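To make the "noise injected into the distribution" point concrete, here is a toy sketch, not a claim about any real training pipeline. It assumes a "model" that is nothing more than a categorical distribution over a small vocabulary, refit each generation on samples drawn from the previous generation. The function names, vocabulary size, and sample count are all made up for illustration.

```python
import random
from collections import Counter


def refit_on_own_samples(probs, n_samples, rng):
    """Draw samples from the current distribution and refit by counting.

    This stands in for "train the next generation on unfiltered output
    of the previous one". It is a toy assumption, not a real training loop.
    """
    tokens = list(range(len(probs)))
    draws = rng.choices(tokens, weights=probs, k=n_samples)
    counts = Counter(draws)
    return [counts[t] / n_samples for t in tokens]


def support_size_per_generation(vocab_size=50, n_samples=100,
                                generations=30, seed=0):
    """Return how many tokens still have nonzero probability each generation."""
    rng = random.Random(seed)
    # Hypothetical "ground truth": a Zipf-like distribution with a long tail.
    raw = [1.0 / (rank + 1) for rank in range(vocab_size)]
    total = sum(raw)
    probs = [w / total for w in raw]

    sizes = [sum(p > 0 for p in probs)]
    for _ in range(generations):
        probs = refit_on_own_samples(probs, n_samples, rng)
        sizes.append(sum(p > 0 for p in probs))
    return sizes


if __name__ == "__main__":
    print(support_size_per_generation())
```

Once a rare token happens to get zero samples in some generation, its probability is zero forever after, so the support can only shrink. The tail of the original distribution gets eaten first, which is one simple way the per-generation sampling errors compound instead of averaging out.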
But. "Our current methods" include reinforcement learning. So long as there's a signal indicating better solutions, performance tends to improve.
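A toy sketch of why a quality signal changes the picture. This is not reinforcement learning proper; it swaps in a much simpler select-the-best-and-refit loop (in the spirit of the cross-entropy method) as a stand-in for "a signal indicating better solutions". The scalar "model", the reward function, and every parameter below are assumptions made up for illustration.

```python
import random
import statistics


def reward(x, target=5.0):
    """Hypothetical quality signal: higher is better, best at x == target."""
    return -(x - target) ** 2


def train_with_selection(generations=20, n_samples=200, keep=20, seed=0):
    """Sample from the model, keep only the best-scoring outputs, refit.

    The "model" here is just the mean of a unit-variance Gaussian.
    Returns the mean after each generation, starting with the initial one.
    """
    rng = random.Random(seed)
    mean = 0.0  # starts far from the target of 5.0
    history = [mean]
    for _ in range(generations):
        samples = [rng.gauss(mean, 1.0) for _ in range(n_samples)]
        # Filter step: only the highest-reward outputs become training data.
        best = sorted(samples, key=reward, reverse=True)[:keep]
        mean = statistics.mean(best)
        history.append(mean)
    return history


if __name__ == "__main__":
    history = train_with_selection()
    print(f"start: {history[0]:.2f}, end: {history[-1]:.2f}")
```

The model is still training on its own output, but only on the slice that the signal ranks highest, so each generation is pulled toward the target rather than drifting on sampling noise. Take out the filter step and you are back to the unfiltered feedback loop.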
Why not just feed it random data? It's so smart that it will figure out which parts are random, so eventually you will generate some good data randomly, and it will feed on it, and become exponentially smarter exponentially fast.