by zqiu 26 days ago
Hi, thank you so much :) Yes, you are very sharp: to make it truly scalable, we have to finetune quantized LLMs directly, similar in idea to QLoRA. The algorithmic innovation is based on our research on orthogonal finetuning (OFT) and QOFT, which has demonstrated better training stability when finetuning quantized base models at such a scale.
1 comment

Thank you for the information, for creating this, and for releasing the research behind it. Truly appreciated.