by vladf 1163 days ago
Err, I suppose trivially, the higher-rank terms include the lower-rank subnets, so they dominate in terms of quality.

But if you have some capacity constraint (e.g., memory, I guess?) then you can imagine dynamic rank allocation helping in the case where the maximum rank across all layers isn't within budget.

It's a bit of a stretch though, I agree.

1 comment

As someone else mentioned [0], the procedure would basically be to train a DyLoRA for a few initial iterations, search across the layers for the best-scoring combination of ranks, then prune to just those ranks and train to completion.

Seems complicated, but I could see it being potentially useful.

[0]: https://news.ycombinator.com/item?id=35517353
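
For what it's worth, the search step in the procedure above could be sketched roughly like this. This is only an illustration, not anything from the DyLoRA paper: it assumes you already have a per-layer score for each candidate rank (the `scores` input is hypothetical, e.g. from evaluating the briefly trained DyLoRA truncated to that rank), that scores add up across layers (a strong simplification), and that the capacity constraint is a cap on the sum of ranks. The name `allocate_ranks` is made up.

```python
from typing import Dict, List, Tuple


def allocate_ranks(scores: List[Dict[int, float]], budget: int) -> List[int]:
    """Pick one rank per layer, maximizing the summed score with sum(ranks) <= budget.

    scores[i][r] is a hypothetical quality score for layer i truncated to rank r.
    Assumes layer scores are additive, which real models will only approximate.
    """
    # best[used] = (total score, ranks chosen so far) for each total rank used
    best: Dict[int, Tuple[float, List[int]]] = {0: (0.0, [])}
    for layer in scores:
        nxt: Dict[int, Tuple[float, List[int]]] = {}
        for used, (total, picks) in best.items():
            for rank, score in layer.items():
                new_used = used + rank
                if new_used > budget:
                    continue  # this choice would exceed the rank budget
                cand = (total + score, picks + [rank])
                if new_used not in nxt or cand[0] > nxt[new_used][0]:
                    nxt[new_used] = cand
        best = nxt
    if not best:
        raise ValueError("budget too small for any allocation")
    # Return the rank list with the highest total score over all budgets used
    return max(best.values(), key=lambda entry: entry[0])[1]


# Toy example: two layers, candidate ranks 1, 2 and 4, made-up scores
toy_scores = [{1: 0.5, 2: 0.6, 4: 0.9}, {1: 0.2, 2: 0.8, 4: 0.85}]
chosen = allocate_ranks(toy_scores, budget=6)
```

With a budget that fits the maximum rank in every layer, this just picks the top-scoring rank everywhere, which matches the point in the parent comment: the allocation only matters once the budget is tighter than that.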