|
|
|
|
|
by Majromax
1163 days ago
|
|
Highest posssible in which combination, though? If you’re fine tuning a model with N layers, then you could apply LoRA to any or all of them. Maybe it’s better to concentrate effort unevenly, in which case a uniform increase of adaptation rank (to compute budget) could still be subpar. |
|