by sorenjan 852 days ago
I just read the article, nice write-up! I think it would benefit from a short explanation of what the magnitude vector (m) and the directional matrix (V) are; I'm not familiar with that kind of decomposition.

Not directly related to the article, but tangentially relevant: would it be possible to train a LoRA or DoRA with a high rank, then use SVD to check whether the rank is too high and truncate to a better value of r? Maybe even use different ranks for different layers after some training?
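
To make the idea concrete, here is a minimal NumPy sketch of what such a post-hoc truncation could look like. It is not from the article and has not been validated on a real model. It assumes the low-rank update is stored as two factors `B` (d_out x r) and `A` (r x d_in), and it picks the new rank with a 99% squared-singular-value "energy" threshold. The threshold and the helper names `effective_rank` and `truncate_lora` are illustrative assumptions, not part of any library.

```python
import numpy as np

def effective_rank(singular_values, energy=0.99):
    """Smallest k whose top-k singular values hold `energy` of the squared spectrum."""
    sq = singular_values ** 2
    cumulative = np.cumsum(sq) / np.sum(sq)
    k = int(np.searchsorted(cumulative, energy)) + 1
    return min(k, len(singular_values))  # guard against rounding at energy ~ 1.0

def truncate_lora(B, A, energy=0.99):
    """Re-factor the update B @ A to a lower rank via truncated SVD.

    The energy threshold is an assumed heuristic, not an established rule.
    """
    delta_w = B @ A                                   # full update, rank <= r
    U, S, Vt = np.linalg.svd(delta_w, full_matrices=False)
    k = effective_rank(S, energy)
    B_new = U[:, :k] * S[:k]                          # fold singular values into B
    A_new = Vt[:k, :]
    return B_new, A_new, k

# Toy example: a nominal rank-16 adapter whose update really has rank <= 4.
rng = np.random.default_rng(0)
B = rng.normal(size=(64, 4)) @ rng.normal(size=(4, 16))
A = rng.normal(size=(16, 48))
B_new, A_new, k = truncate_lora(B, A)
print("nominal rank:", A.shape[0], "-> truncated rank:", k)
```

Running this per layer would give a different k for each layer, which is the "different ranks for different layers" variant. For DoRA, the same step would presumably apply only to the low-rank directional update, with the magnitude vector left untouched. Whether the truncated adapter keeps its accuracy without further fine-tuning is an open question here.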

1 comment

Thanks for the feedback. Clarifying definitely wouldn't hurt. Added a paragraph and new figure at the top of the DoRA section: https://magazine.sebastianraschka.com/i/141797214/introducin...

I haven't tried what you're suggesting, but it actually sounds plausible. Interesting idea!