by rasbt
845 days ago
Not sure, but in general it looks like ZipLoRA is only useful in specific contexts, like when you have two different tasks you want to optimize for (e.g., style and content in a vision context). DoRA is more general: it's basically normalizing and scaling the LoRA-updated weight matrices (splitting each one into a magnitude and a direction component) to get much better performance. According to the paper, it even works great at low ranks, which effectively makes it even more parameter-efficient than OG LoRA.
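To make the "normalizing and scaling" part concrete, here is a rough NumPy sketch of how a merged DoRA-style weight could be computed. It is only an illustration under my own assumptions, not the paper's reference code: the function name `dora_weight`, the toy shapes, and the choice to initialize the magnitude from the pretrained weight's column norms are all hypothetical.

```python
import numpy as np

def dora_weight(W0, A, B, m):
    """Magnitude m times the column-normalized direction (W0 + B @ A)."""
    V = W0 + B @ A                                        # direction, updated by the low-rank LoRA factors
    col_norm = np.linalg.norm(V, axis=0, keepdims=True)   # L2 norm of each column, shape (1, d_in)
    return m * (V / col_norm)                             # rescale each unit column by its magnitude

rng = np.random.default_rng(0)
d_out, d_in, r = 6, 4, 2                        # toy sizes, chosen only for the demo
W0 = rng.normal(size=(d_out, d_in))             # stand-in for a pretrained weight
A = rng.normal(size=(r, d_in))
B = np.zeros((d_out, r))                        # usual LoRA convention: B starts at zero
m = np.linalg.norm(W0, axis=0, keepdims=True)   # assumed init: magnitude taken from W0

W = dora_weight(W0, A, B, m)                    # with B = 0 this reproduces W0
```

With `B` at zero the merged weight equals `W0`, so training starts from the pretrained model; after that, `m` and the low-rank factors are what get updated.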
|
Not directly related to the article, but tangentially relevant: would it be possible to train a LoRA or DoRA with a high rank, then use SVD to check whether the rank is too high and truncate to a better value of r? Maybe even use different ranks for different layers after some training?
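For what it's worth, the SVD idea could be sketched roughly like this in NumPy. This is a guess at one way to do it, not something taken from the article or from either paper: `truncate_lora`, the `energy` threshold, and the toy shapes are all hypothetical, and it forms the full update `B @ A`, which could be expensive for large layers.

```python
import numpy as np

def truncate_lora(B, A, energy=0.99):
    """Re-factor the update B @ A at the smallest rank keeping `energy` of its squared singular values."""
    delta = B @ A                                        # full low-rank update, shape (d_out, d_in)
    U, S, Vt = np.linalg.svd(delta, full_matrices=False)
    cum = np.cumsum(S**2) / np.sum(S**2)                 # cumulative fraction of squared singular values
    r_new = min(int(np.searchsorted(cum, energy)) + 1, len(S))
    B_new = U[:, :r_new] * S[:r_new]                     # fold the singular values into the left factor
    A_new = Vt[:r_new]
    return B_new, A_new, r_new

# Toy check: nominal rank 16, but only 3 rows of A are non-zero,
# so the update actually has rank 3.
rng = np.random.default_rng(0)
B = rng.normal(size=(32, 16))
A = np.zeros((16, 24))
A[:3] = rng.normal(size=(3, 24))

B_t, A_t, r_t = truncate_lora(B, A, energy=0.9999)
rel_err = np.linalg.norm(B @ A - B_t @ A_t) / np.linalg.norm(B @ A)
```

Running this per layer would give a different `r_new` for each one, which is basically the "different ranks for different layers" idea. Whether the truncated adapters hold up on the actual task would still need to be checked empirically.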