HN Mirror



	by vladf 1163 days ago
	The optimal rank could differ across layers


whimsicalism 1163 days ago

I would be shocked if the "optimal rank" in terms of performance weren't simply the maximum rank the DyLoRA was trained with, used across all layers.


vladf 1163 days ago

Err, I suppose trivially, the higher-rank terms include the lower-rank subnets, so they dominate in terms of quality.
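To make the "includes the lower-rank subnets" point concrete, here is a minimal sketch, assuming the nested-truncation view of LoRA in which the rank-r update uses only the first r columns of the up-projection B and the first r rows of the down-projection A. The function name `lora_update` and the toy matrices are hypothetical, not taken from any DyLoRA code.

```python
def lora_update(B, A, r):
    """Return B[:, :r] @ A[:r, :] as nested lists (pure Python, no NumPy).

    B has shape (rows, R) and A has shape (R, cols); only the first r
    rank-1 terms are summed, so rank r is a prefix of rank r + 1.
    """
    rows, cols = len(B), len(A[0])
    return [
        [sum(B[i][k] * A[k][j] for k in range(r)) for j in range(cols)]
        for i in range(rows)
    ]


# Toy factors with maximum rank R = 2 (illustrative values only).
B = [[1, 2], [3, 4]]
A = [[1, 0], [0, 1]]

low = lora_update(B, A, 1)   # rank-1 subnet
full = lora_update(B, A, 2)  # full rank-2 update

# The extra rank-1 term that rank 2 adds on top of rank 1.
extra = [[B[i][1] * A[1][j] for j in range(2)] for i in range(2)]
```

Under that assumption the full update is the rank-1 update plus one more rank-1 term, so the higher rank can always reproduce the lower one by ignoring the extra term.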

But if you have some capacity constraint (e.g., memory, I guess?) then you can imagine dynamic rank allocation helping in the case where the maximum rank across all layers isn't within budget.

It's a bit of a stretch, though, I agree.


whimsicalism 1163 days ago

As someone else mentioned [0], the procedure would basically be to train a DyLoRA for a few initial iterations, then search across the layers for the best-scoring combination of ranks, and then prune to just those ranks and train to completion.

Seems complicated but I could see it being useful potentially.

[0]: https://news.ycombinator.com/item?id=35517353
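A rough sketch of the search step described above, under loud assumptions: the per-layer scores are made-up numbers standing in for whatever validation metric you would measure after the warm-up iterations, the total score is assumed to be the sum of per-layer scores, and the budget is a cap on the sum of ranks. `best_rank_combination` is a hypothetical helper, and brute force only works for a handful of layers.

```python
from itertools import product


def best_rank_combination(scores, budget):
    """Brute-force the per-layer ranks that maximize the summed score.

    scores: one dict per layer mapping candidate rank -> score (hypothetical).
    budget: maximum allowed sum of ranks across layers.
    Returns (ranks, total), or (None, -inf) if nothing fits the budget.
    """
    best_combo, best_total = None, float("-inf")
    for combo in product(*(sorted(layer) for layer in scores)):
        if sum(combo) > budget:
            continue  # over the capacity constraint
        total = sum(layer[r] for layer, r in zip(scores, combo))
        if total > best_total:
            best_combo, best_total = combo, total
    return best_combo, best_total


# Made-up scores for three layers at candidate ranks 1, 2, and 4.
scores = [
    {1: 0.50, 2: 0.70, 4: 0.75},
    {1: 0.40, 2: 0.45, 4: 0.80},
    {1: 0.60, 2: 0.62, 4: 0.63},
]

tight_ranks, tight_total = best_rank_combination(scores, budget=7)
loose_ranks, loose_total = best_rank_combination(scores, budget=12)
```

With these toy numbers, the loose budget just picks the maximum rank everywhere, while the tight budget spends rank on the layer that benefits most. After this step you would prune each layer to its chosen rank and continue training.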
