|
|
|
|
|
by hansvm
1101 days ago
|
|
It's not the first paper on the topic IIRC, but OpenAI's InstructGPT paper [0] is decent and references enough other material to get started. The key idea is that they're able to start with large amounts of relatively garbage unsupervised data (the internet), and use that model to cheaply generate decent amounts of better data (ranking generated content rather than spending the man-hours to actually write good content). The other details aren't too important. [0] https://arxiv.org/abs/2203.02155 |
|