by arugulum 102 days ago
LoRA? The parameter-efficient fine-tuning method published 2 years before Llama and already actively used by researchers?

RoPE? The position encoding method published 2 years before Llama and already in models such as GPT-J-6B?

DPO, a method whose paper had no experiments with Llama?

QLoRA? The third in a series of quantization works by Tim Dettmers, the first two of which pre-dated Llama?

1 comment

You're right, those things predated the Llama leak. But from my understanding (from the sidelines), it's Llama that made them popular and approachable from a hacker's perspective.