|
|
|
|
|
by patelajay285
595 days ago
|
|
We've been working on a Python framework where one of the use cases is easy distillation from larger models to smaller open-source models and smaller-closed source models (where you don't have to still use / pay for the closed-source API service): https://datadreamer.dev/docs/latest/ Here's an (now slightly outdated) example of OpenAI GPT-4 => OpenAI GPT-3.5: https://datadreamer.dev/docs/latest/pages/get_started/quick_... But you can also do GPT-4 to any model on HuggingFace. Or something like Llama-70B to Llama-1B. For some tasks, this kind of distillation works extremely well given even a few hundred examples of the larger model performing the task. |
|
I'm confused why you are mentioning 3.5 here. The weights aren't public, so you aren't actually running any derivative of GPT-3.5
Or am I mistaken. Can you clarify?