

	by fancyfredbot 1163 days ago
	You get diminishing returns as you increase the rank, so with a fixed training budget it's not clear whether you get the best return from increasing rank or from increasing something else. If you start off by training DyLoRA with max rank 8, you might see returns diminish fast beyond rank 5. Then you can use rank 5 for the rest of your training. You wouldn't know that with LoRA. I think this is the idea behind the paper. If you are just going to use your entire budget training a DyLoRA with max rank 8, then you're right, there's no advantage over LoRA with rank 8. You'd have to use the ability to assess multiple ranks in order to see some benefit.
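To make the "returns diminish beyond rank 5" idea concrete, here is a minimal sketch of how one might pick a rank from per-rank evaluation scores. Everything in it is an assumption for illustration: `pick_rank`, the `min_gain` threshold, and the scores themselves are made up and are not taken from the DyLoRA paper or any library.

```python
def pick_rank(scores_by_rank, min_gain=0.01):
    """Return the smallest rank after which the next rank adds less than min_gain.

    scores_by_rank maps rank -> validation score (higher is better).
    If every step up in rank still gains at least min_gain, return the max rank.
    """
    ranks = sorted(scores_by_rank)
    for lower, higher in zip(ranks, ranks[1:]):
        gain = scores_by_rank[higher] - scores_by_rank[lower]
        if gain < min_gain:
            return lower
    return ranks[-1]


# Hypothetical scores from evaluating one max-rank-8 adapter at each truncated rank.
hypothetical_scores = {
    1: 0.60, 2: 0.70, 3: 0.76, 4: 0.80,
    5: 0.83, 6: 0.832, 7: 0.833, 8: 0.834,
}

chosen_rank = pick_rank(hypothetical_scores)
print(chosen_rank)
```

Note that this stops at the first plateau, so it would miss any gain from higher ranks that only shows up later in training.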


whimsicalism 1163 days ago

I can see that. But are we sure that a rank-based difference that doesn't manifest early in the training process won't manifest as you get further along? See also 'grokking' [0].

[0]: https://arxiv.org/abs/2201.02177


fancyfredbot 1163 days ago

Not sure there's any way to know beforehand whether that would happen, but the advantage of DyLoRA is that at least you will know afterwards whether you really needed the full rank, whereas with LoRA you wouldn't. In some cases that might not be valuable information, but I guess you'd rather know than not.
