| HN Mirror


by arugulum 102 days ago

LoRA? The parameter-efficient fine-tuning method published 2 years before Llama and already actively used by researchers?

RoPE? The position encoding method published 2 years before Llama and already in models such as GPT-J-6B?

DPO, a method whose paper had no experiments with Llama?

QLoRA? The third in a series of quantization works by Tim Dettmers, the first two of which pre-dated Llama?

1 comment

tuananh 101 days ago

You're right, those things predated the Llama leak. But from my understanding (watching from the sidelines), it's Llama that made them popular and approachable from a hacker's perspective.
