by amasad
1135 days ago
Broadly, finetuning is any training done after pretraining. Most of the time the term is used for fitting a narrower task. In our case, we kept the same training objective as in pretraining, but the data was meant to be more representative of what Replit users like to code. However, we were surprised by how well it boosted overall performance. Best guess: a) the data was novel and b) the model could take even more training!!
Could you, say, fine-tune the model every week with the latest merges? Every hour?