by red75prime
536 days ago
Degradation of autoregressive models fed their own unfiltered output is pretty obvious: it's basically noise being injected into the ground-truth probability distribution. But "our current methods" include reinforcement learning. So long as there's a signal indicating better solutions, performance tends to improve.
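A toy sketch of the two regimes, not a claim about how any real model is trained. It assumes a 1-D Gaussian as the "model", and `refit_loop`, `keep_fraction` and `target` are made-up names for illustration. Refitting on every sample only accumulates sampling noise, while keeping the samples nearest a target (a crude stand-in for a reward signal, closer to the cross-entropy method than to actual RL) pulls the model toward that target.

```python
import random
import statistics

def refit_loop(generations, n, keep_fraction=1.0, target=3.0, seed=0):
    """Repeatedly refit a 1-D Gaussian to its own samples.

    keep_fraction=1.0 trains on every sample (unfiltered self-training).
    keep_fraction<1.0 keeps only the samples closest to `target`,
    a hypothetical stand-in for a reward signal.
    """
    rng = random.Random(seed)
    mean, std = 0.0, 1.0  # the "ground truth" starting distribution
    for _ in range(generations):
        samples = [rng.gauss(mean, std) for _ in range(n)]
        samples.sort(key=lambda x: abs(x - target))  # best-scoring first
        kept = samples[: max(2, int(n * keep_fraction))]
        mean = statistics.fmean(kept)
        std = statistics.pstdev(kept)  # maximum-likelihood refit
    return mean, std

# Unfiltered: small samples, many generations, no selection.
unfiltered_mean, unfiltered_std = refit_loop(generations=300, n=20)

# Filtered: keep the best 20% of samples each generation.
filtered_mean, filtered_std = refit_loop(generations=10, n=200, keep_fraction=0.2)

print(f"unfiltered: mean={unfiltered_mean:.3f} std={unfiltered_std:.2e}")
print(f"filtered:   mean={filtered_mean:.3f} (target 3.0)")
```

In the unfiltered run the fitted spread shrinks generation after generation, so the tails of the original distribution are lost. In the filtered run the mean moves from 0 toward the target, although the spread shrinks there too, so this crude selection can stall before reaching it.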