by red75prime 536 days ago
It's pretty obvious why autoregressive models degrade when fed their own unfiltered output: that's basically noise being injected into the ground-truth probability distribution.

But "our current methods" include reinforcement learning. As long as there's a signal indicating which solutions are better, performance tends to improve.
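
A toy sketch of both claims, under heavy assumptions: a 1-D Gaussian stands in for the model, "training" is just refitting the mean and spread to its own samples, and the "signal" is a made-up reward that prefers samples near a hypothetical TARGET. The names refit_loop, TARGET and keep_fraction are invented for illustration; this is not how any real autoregressive model or RL pipeline is trained.

```python
import random
import statistics


def refit_loop(generations, n_samples, reward=None, keep_fraction=0.25, seed=0):
    """Repeatedly refit a 1-D Gaussian to samples drawn from itself.

    If `reward` is given, only the top `keep_fraction` of samples by reward
    are kept before refitting (a crude stand-in for a learning signal).
    """
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0  # the starting "ground truth" model (assumed)
    for _ in range(generations):
        # The model generates its own training data.
        samples = [rng.gauss(mu, sigma) for _ in range(n_samples)]
        if reward is not None:
            # Filter: keep only the samples the signal rates highest.
            samples.sort(key=reward, reverse=True)
            samples = samples[: max(2, int(n_samples * keep_fraction))]
        # "Retrain" on whatever data is left.
        mu = statistics.fmean(samples)
        sigma = statistics.pstdev(samples)
    return mu, sigma


TARGET = 2.0  # hypothetical "better solution" that the reward points toward

# No signal: every refit adds sampling noise to the previous estimate.
unfiltered_mu, unfiltered_sigma = refit_loop(200, 20)

# With a signal: samples closer to TARGET are preferred before refitting.
guided_mu, guided_sigma = refit_loop(20, 200, reward=lambda x: -(x - TARGET) ** 2)

print(f"unfiltered: mu={unfiltered_mu:.3f} sigma={unfiltered_sigma:.3g}")
print(f"guided:     mu={guided_mu:.3f} sigma={guided_sigma:.3g}")
```

In the unfiltered run the estimate wanders and its spread tends to shrink, since nothing corrects the accumulated sampling noise. In the guided run the mean should move toward TARGET, because the reward decides which samples get refit.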