by arugulum
102 days ago
LoRA? The parameter-efficient fine-tuning method published 2 years before Llama and already actively used by researchers? RoPE? The position encoding method published 2 years before Llama and already in models such as GPT-J-6B? DPO, a method whose paper had no experiments with Llama? QLoRA? The third in a series of quantization works by Tim Dettmers, the first two of which pre-dated Llama?