Hacker News new | ask | show | jobs
by nitwit005 538 days ago
With our current methods, feeding back even fairly small amounts of outputs back in as training data leads to declining performance.

Just think of it abstractly. The AI will be trained on the errors the previous generation made. As long as it keeps making new errors each generation, they will tend to multiply.

1 comments

Degradation of autoregressive models being fed their own unfiltered output is pretty obvious: it's, basically, noise being injected into the ground truth probability distribution.

But. "Our current methods" include reinforcement learning. So long as there's a signal indicating better solutions, performance tends to improve.