by sorenjan
852 days ago
I just read the article, nice write-up! I think it would benefit from a short explanation of what the magnitude vector (m) and the directional matrix (V) are; I'm not familiar with that kind of decomposition. Not related to the article but tangentially relevant: would it be possible to train a LoRA or DoRA with a high rank, then use SVD to see whether the rank is too high and truncate to a better value of r? Maybe even use different ranks for different layers after some training?
I haven't tried what you're suggesting, but it actually sounds plausible. Interesting idea!
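
For anyone who wants to play with the idea, here is a rough, untested-in-practice sketch of the SVD check in NumPy. It assumes the usual LoRA layout where the update is the product B @ A (B is d_out x r, A is r x d_in) and ignores the alpha/r scaling. The function names and the 99% "energy" threshold are made up for illustration and do not come from the article or any library.

```python
import numpy as np

def effective_rank(B, A, energy=0.99):
    """Smallest k whose top-k singular values of B @ A hold `energy`
    of the total squared singular values (hypothetical criterion)."""
    s = np.linalg.svd(B @ A, compute_uv=False)
    cum = np.cumsum(s**2) / np.sum(s**2)
    k = int(np.searchsorted(cum, energy)) + 1
    return min(k, len(s))

def truncate_lora(B, A, k):
    """Return rank-k factors (B_k, A_k) such that B_k @ A_k is the
    best rank-k approximation of B @ A in the Frobenius norm."""
    U, s, Vt = np.linalg.svd(B @ A, full_matrices=False)
    root = np.sqrt(s[:k])
    # Split each singular value evenly between the two factors.
    return U[:, :k] * root, root[:, None] * Vt[:k]

# Toy example: a nominal rank-16 adapter whose update is really rank 4.
rng = np.random.default_rng(0)
B = rng.standard_normal((64, 4)) @ rng.standard_normal((4, 16))
A = rng.standard_normal((16, 32))

k = effective_rank(B, A)
B_k, A_k = truncate_lora(B, A, k)
rel_err = np.linalg.norm(B @ A - B_k @ A_k) / np.linalg.norm(B @ A)
print(k, B_k.shape, A_k.shape, rel_err)
```

Running this per layer would give a different k for each one, which is the "different ranks for different layers" part. Whether the truncated adapter keeps its accuracy on a real model is an open question here; this only measures how well the weight update itself is approximated. For large layers you would probably want to avoid forming the full B @ A matrix.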