


	by magicalhippo 109 days ago
	I got the impression it's due to the reinforcement learning step, where the model is taught to follow instructions and ends up being rewarded for such writing.