| HN Mirror


	by red75prime 536 days ago
	Degradation of autoregressive models fed their own unfiltered output is pretty obvious: it's basically noise being injected into the ground-truth probability distribution. But "our current methods" include reinforcement learning. So long as there's a signal indicating better solutions, performance tends to improve.
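
	A toy sketch of that contrast, not a claim about how any real model is trained. It assumes a one-parameter "model" (a Gaussian with fixed unit variance whose mean is refit on its own samples each generation) and a made-up reward that prefers samples near a hypothetical TARGET. The names retrain, reward, and TARGET are illustrative only. Without filtering, the mean just accumulates sampling noise; when only the best-rewarded samples are kept, it moves toward the target.

```python
import random
import statistics

TARGET = 3.0  # hypothetical "better solution" that the toy reward points at


def reward(x):
    # Assumed reward signal: higher when a sample is closer to TARGET.
    return -abs(x - TARGET)


def retrain(mean, generations=50, n=200, keep=None, seed=0):
    """Refit the model's mean on its own samples, generation after generation.

    keep=None  -> train on all samples (unfiltered self-training).
    keep=k     -> train only on the k best-rewarded samples.
    """
    rng = random.Random(seed)
    for _ in range(generations):
        # The model generates its own training data.
        samples = [rng.gauss(mean, 1.0) for _ in range(n)]
        if keep is not None:
            # Selection step: the reward decides which samples survive.
            samples = sorted(samples, key=reward, reverse=True)[:keep]
        mean = statistics.fmean(samples)
    return mean


unfiltered = retrain(0.0)          # random walk driven by sampling noise
filtered = retrain(0.0, keep=50)   # pulled toward TARGET by the reward

print(f"unfiltered mean: {unfiltered:.3f}")
print(f"filtered mean:   {filtered:.3f} (target {TARGET})")
```

	Keeping the top-k samples is a crude stand-in for reinforcement learning (closer to rejection sampling than to a policy-gradient method), but it shows the same point: the selection signal, not the self-generated data alone, is what drives the improvement.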