| HN Mirror



	by deevolution 1118 days ago
	Aren't they using RLHF? The feedback from humans might not always be the ~right~ feedback. Couldn't that possibly degrade the quality of its responses?