by ankit219
841 days ago
More than pretraining data, I think the advantage was ChatGPT and how quickly it grew. Remember, it launched on 3.5, and within a month or two it had generated a huge number of real Q&A pairs with ratings, feedback, and production-level data on how actual users would use a model. Those queries, plus the subsequent RLHF and generating better answers to the questions, meant the model would have improved a lot at the SFT stage. I think this is why Anthropic, Google, and Mistral all launched their own chatbots, offering them to users for free and getting real-time Q&A data to finetune their models on. Google did it with Bard too, but it was so bad that not many people used it.
|
But maybe that was still enough time for them to instruction-tune it based on ChatGPT feedback, or at least to focus more of their fine-tuning iterations on the areas they learned were strong or weak for 3.5 based on ChatGPT usage?