Hacker News new | ask | show | jobs
by leobg 1179 days ago
Thanks for the clarification. Does that mean then that when parallelization is not important, training an adapter might be just as good as or better than LoRA?
1 comments

If latency is irrelevant, I don't think there is a strong practical reason to prefer one over another. (LoRA is more elegant in my biased opinion because you roughly recover finetuning with a large r.) In practice, you see one do a little better on some tasks and vice versa on others as observed by papers after mine.