by abcdabcd987
947 days ago
Thank you! We are also very excited about combining fast fine-tuning with efficient serving. In fact, what you just described is closely related to one of our earliest motivations. In my previous blog post [1], I call this scheme "Just-in-time Fine-tuning". Our earlier measurements show that, for a medium-sized webpage (~10K tokens), fine-tuning a LoRA model takes around 30 seconds to 2 minutes. Another upside of this JIT fine-tuning scheme is that we can turn any model into a long-context model. We'll keep doing more research on fine-tuning, and hopefully we'll have results to share soon.

[1] https://le.qun.ch/en/blog/2023/09/11/multi-lora-potentials/