| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by touringa 1289 days ago
	The human feedback fine-tuning concept was applied following strict policies and rules. The rules chosen by OpenAI would be very similar to those applied by DeepMind for the Sparrow dialogue model (Sep/2022), which is a fine-tuned version of DeepMind’s Chinchilla model. The rules used for DeepMind Sparrow were selected by researchers from DeepMind (Alphabet), California Institute of Technology, University of Toronto, and University College Dublin.