| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by ozr 769 days ago

I’ve taught LLMs imaginary words and their meanings with minute amounts of data (two or three examples) via full fine-tuning, LoRA and QLoRA.

I have no idea where the myth of ‘can’t add new knowledge via fine-tuning’ came from. It’s a sticky meme that makes no sense.

Pretraining obviously adds knowledge to a model. The difference between pretraining and fine-tuning is the number of tokens and learning rate. That’s it.