| HN Mirror



	by Majromax 1163 days ago
	Highest possible in which combination, though? If you’re fine-tuning a model with N layers, then you could apply LoRA to any or all of them. Maybe it’s better to concentrate effort unevenly, in which case a uniform increase of adaptation rank (up to the compute budget) could still be subpar.
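To make the "uneven" point concrete, here is a rough sketch of the bookkeeping, not anything taken from the paper. It only uses the standard LoRA parameter count (a rank-r update to a d_out x d_in weight adds r * (d_in + d_out) parameters). The 4-layer model, the layer shapes, and the two rank allocations are made-up numbers for illustration.

```python
def lora_params(d_in, d_out, rank):
    # A LoRA update B @ A has A of shape (rank, d_in) and B of shape (d_out, rank).
    return rank * (d_in + d_out)


def total_params(layer_shapes, ranks):
    # Sum the adapter parameters over all adapted layers.
    return sum(
        lora_params(d_in, d_out, r)
        for (d_in, d_out), r in zip(layer_shapes, ranks)
    )


# Hypothetical model: four 1024x1024 projection matrices (assumed sizes).
shapes = [(1024, 1024)] * 4

uniform = [8, 8, 8, 8]    # same rank everywhere
uneven = [2, 2, 12, 16]   # same total rank, concentrated in later layers

print(total_params(shapes, uniform))
print(total_params(shapes, uneven))
```

Both allocations cost the same number of trainable parameters, so a search that only scales one shared rank up or down never visits the second one. Whether the uneven split actually trains better is an empirical question this sketch does not answer.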

1 comment

Right, but the way this paper proposes determining the best rank is by training a LoRA at full rank.