by cubefox
1133 days ago
There are two innovations: instruction fine-tuning (via supervised learning), which gives you a model that behaves as if it is in a dialogue (instead of just continuing text), and, on top of that, reinforcement learning from human feedback (RLHF), which steers how it responds to those instructions toward what human raters prefer.
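To make the distinction concrete, here is a toy sketch of the two training signals, not anyone's actual implementation. The function names `sft_loss` and `preference_loss` are made up for illustration. It assumes the common setup where supervised fine-tuning minimizes negative log-likelihood on the response tokens only, and where the human-feedback stage first fits a reward model with a pairwise (Bradley-Terry style) preference loss. The reinforcement learning step that then optimizes the model against that reward is left out.

```python
import math


def sft_loss(token_logprobs, is_response):
    """Supervised fine-tuning signal (assumed form): mean negative
    log-likelihood of the demonstration, counted on response tokens only."""
    picked = [-lp for lp, keep in zip(token_logprobs, is_response) if keep]
    return sum(picked) / len(picked)


def preference_loss(reward_chosen, reward_rejected):
    """Reward-model signal (assumed form): -log(sigmoid(r_chosen - r_rejected)),
    written as log(1 + exp(-diff)). Small when the human-preferred answer
    already scores higher than the rejected one."""
    diff = reward_chosen - reward_rejected
    return math.log1p(math.exp(-diff))


# Hypothetical numbers: two prompt tokens (masked out), two response tokens.
logprobs = [-2.0, -1.0, -0.5, -0.25]
mask = [False, False, True, True]
print(sft_loss(logprobs, mask))

# The loss shrinks as the preferred answer is scored further above the other.
print(preference_loss(2.0, 0.0), preference_loss(0.0, 2.0))
```

The first loss only needs example dialogues to imitate. The second needs humans to compare two candidate answers, which is where the "human feedback" comes in.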