by cheald
261 days ago
Standard LoRA (W_delta = B@A with the standard inits) generally underperforms full fine-tuning, primarily because of "intruder dimensions": new high-ranking singular vectors that are misaligned with the singular vectors of the underlying weights, as outlined in the paper. Techniques like PiCa and SVFT can mitigate much of the loss, though.
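To make that concrete, here's a rough NumPy sketch of the two ideas: the standard LoRA init (B zeroed, A small random, so B@A starts at zero) and a crude intruder-dimension count. This is my reading of the definition, not the paper's code. The function names, the 0.02 init scale, `k`, and the 0.5 cosine threshold are all assumptions picked for illustration.

```python
import numpy as np

def lora_init(d_out, d_in, rank, rng):
    # Standard LoRA init: A is small random Gaussian, B is all zeros,
    # so the update W_delta = B @ A is exactly zero before training.
    # The 0.02 scale is an illustrative assumption.
    A = rng.normal(0.0, 0.02, size=(rank, d_in))
    B = np.zeros((d_out, rank))
    return B, A

def count_intruder_dimensions(W0, W_tuned, k=10, threshold=0.5):
    # Count how many of the top-k left singular vectors of W_tuned have
    # low cosine similarity to EVERY left singular vector of W0.
    # k and threshold are hypothetical knobs, not values from the paper.
    U0, _, _ = np.linalg.svd(W0, full_matrices=False)
    Ut, _, _ = np.linalg.svd(W_tuned, full_matrices=False)
    # Columns are unit vectors, so dot products are cosine similarities.
    sims = np.abs(Ut[:, :k].T @ U0)
    return int(np.sum(sims.max(axis=1) < threshold))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 64
    W0 = np.diag(np.arange(n, 0, -1, dtype=float))  # toy "pretrained" weights
    B, A = lora_init(n, n, rank=4, rng=rng)
    print(count_intruder_dimensions(W0, W0 + B @ A, k=10))

    # A large rank-1 update along a direction spread across all of W0's
    # singular vectors produces a new top singular vector that matches none.
    u = np.ones((n, 1)) / np.sqrt(n)
    print(count_intruder_dimensions(W0, W0 + 1000.0 * (u @ u.T), k=1))
```

The toy rank-1 update is just a stand-in for a trained B@A; on a real model you'd run this per weight matrix, comparing the pretrained weights against the merged fine-tuned ones.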