Best framework to create synthetic data for finetuning small models?
1 point
by DanyWin
993 days ago
Hi everyone,

Given recent results, such as phi-1.5 performing as well as 7B models on some tasks, it seems plausible that small models can be quite capable on specific tasks. I am working on BlindChat, an open-source and private solution to run small LLMs in your browser, and I am interested in fine-tuning phi-1.5 on some domain-specific data.

I am thinking of an approach similar to that of the researchers behind the phi paper: creating a high-quality dataset using GPT-3.5 / GPT-4. Do you know any good open-source frameworks that make it easy to create high-quality data for a specific task using an existing large model, like GPT-3.5/4 or Llama 2 70B?
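To make the question concrete, here is a rough sketch of the kind of pipeline I have in mind, not the phi authors' actual method. Everything in it is hypothetical: `build_prompts`, `make_dataset`, `to_jsonl`, the prompt template, and the example topics are made up for illustration, and `fake_teacher` is a stub standing in for a real call to GPT-3.5/4 or Llama 2 70B.

```python
import json
from itertools import product
from typing import Callable, Dict, List


def build_prompts(topics: List[str], styles: List[str]) -> List[str]:
    """Cross topics with styles to diversify the prompts sent to the teacher model."""
    return [
        f"Write a {style} about {topic} for a domain-specific assistant."
        for topic, style in product(topics, styles)
    ]


def make_dataset(prompts: List[str], generate: Callable[[str], str]) -> List[Dict[str, str]]:
    """Call the teacher once per prompt; drop empty or duplicate completions."""
    seen = set()
    rows = []
    for prompt in prompts:
        completion = generate(prompt).strip()
        if not completion or completion in seen:
            continue
        seen.add(completion)
        rows.append({"prompt": prompt, "completion": completion})
    return rows


def to_jsonl(rows: List[Dict[str, str]]) -> str:
    """Serialize rows as JSON Lines, one training example per line."""
    return "\n".join(json.dumps(row) for row in rows)


def fake_teacher(prompt: str) -> str:
    # Hypothetical stub: replace with a real call to the large teacher model.
    return f"[teacher output for: {prompt}]"


# Hypothetical topics and styles, chosen only for illustration.
prompts = build_prompts(
    ["tax law", "contract review"],
    ["short Q&A pair", "worked example"],
)
dataset = make_dataset(prompts, fake_teacher)
jsonl_text = to_jsonl(dataset)
```

What I am hoping a framework would handle beyond this is the hard part: prompt diversity, quality filtering, and near-duplicate detection, rather than the exact-match deduplication shown here.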