by Tostino
1186 days ago
I may not have been clear: I was talking about the RLHF dataset/training that OpenAI fine-tuned their models on, which includes a whole bunch of question/answer format data so the fine-tuned models handle that type of query better (as well as constraining the model with a reward mechanism). I'm not saying a fine-tuned model won't contain some representation of the information in the dataset you fine-tuned it on. I'm just saying that, from what I've researched, it is often not the magic trick many people think it is. I've seen plenty of discussion of "fine-tuning" on a different kind of dataset, say company documents, the database schema of an internal application, or summarized logs of your previous conversations with the bot. Those seem like pretty bad targets IMO.
But regular fine-tuning is simple language modelling. You can fine-tune GPT-3 on any collection of texts in order to refresh information in the public model that might be stale since 2021.
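To make "simple language modelling" concrete, here is a rough sketch of what the training data for that kind of fine-tuning looks like: raw text chopped into windows, where the target is just the input shifted by one token. This is only an illustration, not OpenAI's actual pipeline. The function name `make_lm_examples`, the whitespace "tokenizer", and the `<eos>` separator are all made up for the example; a real setup would use the model's own subword tokenizer.

```python
from typing import List, Tuple


def make_lm_examples(
    texts: List[str], block_size: int = 8
) -> List[Tuple[List[str], List[str]]]:
    """Turn raw texts into (input, target) pairs for next-token prediction."""
    # Assumption: a whitespace split stands in for a real subword tokenizer.
    tokens: List[str] = []
    for text in texts:
        tokens.extend(text.split())
        tokens.append("<eos>")  # hypothetical document separator

    examples = []
    # Each window needs block_size inputs plus one extra token,
    # because the target for the last input is the token after it.
    for start in range(0, len(tokens) - block_size, block_size):
        window = tokens[start:start + block_size + 1]
        examples.append((window[:-1], window[1:]))
    return examples


if __name__ == "__main__":
    for inputs, targets in make_lm_examples(["a b c d e"], block_size=2):
        print(inputs, "->", targets)
```

Note there are no questions, answers, or rewards anywhere in this data, only text predicting its own continuation, which is the contrast with the RLHF-style datasets discussed above.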