Hacker News new | ask | show | jobs
by touringa 1289 days ago
The human feedback fine-tuning concept was applied following strict policies and rules. The rules chosen by OpenAI would be very similar to those applied by DeepMind for the Sparrow dialogue model (Sep/2022), which is a fine-tuned version of DeepMind’s Chinchilla model.

The rules used for DeepMind Sparrow were selected by researchers from DeepMind (Alphabet), California Institute of Technology, University of Toronto, and University College Dublin.