How RLHF Preference Model Tuning Works (and How Things May Go Wrong)

	How RLHF Preference Model Tuning Works (and How Things May Go Wrong) (assemblyai.com)
	3 points by mr-ai 1055 days ago