by ozr 811 days ago
This is not correct. Fine-tuning can absolutely add new knowledge to a model. It's been repeatedly demonstrated at this point.

LIMA demonstrated that instruction-tuning and output formatting could be trained with a limited number of samples, not that fine-tuning was incapable of adding new information to the model.

It may be sub-optimal compared to RAG in most cases, but it does work.


Do you have any good links to support the idea that this has been repeatedly demonstrated?

I've had trouble finding high-quality sources of information about successful applications of fine-tuning to add knowledge to a model.

Here is a recent HN discussion of an article that talks about this. https://news.ycombinator.com/item?id=39748537

Anecdotally, I literally "added knowledge" to a model via fine-tuning earlier today.

Fine-tuning can do extremely well given a specific question and answer: the tuned model "knows" how to answer that question much more accurately.

I gave it a specific question and a good answer as fine-tuning input. (Literally 2 data points as the input: 2 question/answer pairs.)

I asked it that question, and the tuned model blows the base model away, for answering that specific question.

> I asked it that question, and the tuned model blows the base model away, for answering that specific question.

Validating on training data... What could possibly go wrong?
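
The worry here is train/eval overlap. A minimal sketch of how one might flag it, assuming questions are plain strings; `normalize` and `leaked_questions` are hypothetical helper names for illustration, not part of any library, and exact matching after normalization will miss paraphrases:

```python
import string


def normalize(text: str) -> str:
    """Lowercase, drop ASCII punctuation, and collapse whitespace."""
    stripped = text.lower().translate(str.maketrans("", "", string.punctuation))
    return " ".join(stripped.split())


def leaked_questions(train_questions, eval_questions):
    """Return eval questions that also appear, after normalization, in the training set."""
    seen = {normalize(q) for q in train_questions}
    return [q for q in eval_questions if normalize(q) in seen]
```

Any question this returns measures recall of the training example rather than performance on unseen input.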

This thread reminds me of a competition I once joined where we were supposed to fine-tune an LLM to fill out trivia answers, and we were expressly disallowed from training on the validation set.

However: we were allowed to pick any base model in a given repo. All of the teams that “won” did so for the same reason: they had all picked the same base model (whereas a majority of teams picked the given default), presumably the one that had at some point been trained on the most favorable data for this particular challenge.

It was quite silly. Had everyone used the same base model, we’d have had a more interesting problem (more about NLP and alignment than about picking the ‘best’ model).

Well, in this case we're literally asking whether the model can remember new facts, not whether it can generalize, so it seems like a legit first-level test. A second-level test might be: can it answer a broader question that incorporates that specific knowledge?
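
The two levels described above could be scored separately. A rough sketch, assuming the model is wrapped as a callable `ask(question) -> str` and that a case-insensitive substring match is an acceptable (crude) correctness check; `accuracy`, `two_level_report`, `memorizing_stub`, and the "Zorblat" fact are all made up for illustration:

```python
from typing import Callable, Dict, Iterable, Tuple

Case = Tuple[str, str]  # (question, substring expected in the answer)


def accuracy(ask: Callable[[str], str], cases: Iterable[Case]) -> float:
    """Fraction of cases whose answer contains the expected substring."""
    cases = list(cases)
    if not cases:
        return 0.0
    hits = sum(expected.lower() in ask(question).lower() for question, expected in cases)
    return hits / len(cases)


def two_level_report(ask, recall_cases, transfer_cases) -> Dict[str, float]:
    """Level 1: the exact questions trained on. Level 2: broader questions needing the same fact."""
    return {
        "recall": accuracy(ask, recall_cases),
        "transfer": accuracy(ask, transfer_cases),
    }


def memorizing_stub(question: str) -> str:
    """Stand-in for a tuned model that only learned one exact question (hypothetical)."""
    known = {"What port does the Zorblat service listen on?": "It listens on port 7431."}
    return known.get(question, "I don't know.")


report = two_level_report(
    memorizing_stub,
    recall_cases=[("What port does the Zorblat service listen on?", "7431")],
    transfer_cases=[("Which service would conflict with a process bound to port 7431?", "Zorblat")],
)
```

A model that scores high on `recall` but low on `transfer` has memorized the training pair without being able to use the fact elsewhere, which is the distinction this subthread is arguing about.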