by dnnssl2
993 days ago
Knowledge instillation is probably the holy grail of fine-tuning. The hard parts are:

1. Generalizing new facts. You can create a question-answer pair such as “what is the population of the world in 2023?” “8 billion”, but the model may not pick up alternate phrasings such as “does the world have 8 billion people on it?”

2. Catastrophic and behavioral forgetting. Continued fine-tuning after RLHF and instruction fine-tuning may result in the loss of the alignment and instruction-following capabilities trained by OpenAI. At worst, the model will start spewing random tokens, like the example in the post.

I have not yet seen it done successfully, and I suspect that updating a small fraction (~0.1%) of the original weights with PEFT methods won’t help.
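To put the "~0.1% of the weights" figure in perspective, here is a rough back-of-the-envelope sketch. It assumes LoRA as the PEFT method (the comment only says "PEFT") and uses hypothetical model dimensions; the `12 * d_model^2` per-layer parameter count is a common approximation that ignores embeddings, not an exact figure for any particular model.

```python
def lora_trainable_fraction(d_model, n_layers, rank, adapted_per_layer=2):
    """Approximate fraction of weights a LoRA adapter trains.

    Assumptions (all hypothetical, for illustration only):
    - each adapted matrix is d_model x d_model
    - a transformer layer has roughly 12 * d_model^2 weights
      (attention + MLP), embeddings ignored
    """
    # Each adapted matrix gets two low-rank factors:
    # (d_model x rank) and (rank x d_model).
    lora_params = n_layers * adapted_per_layer * 2 * d_model * rank
    base_params = 12 * n_layers * d_model ** 2
    return lora_params / base_params


# Hypothetical 7B-class shape: 32 layers, hidden size 4096, rank 8,
# adapting two attention projections per layer.
frac = lora_trainable_fraction(d_model=4096, n_layers=32, rank=8)
print(f"trainable fraction: {frac:.4%}")
```

Under these assumptions the trainable share comes out below 0.1%, the same order of magnitude as the figure quoted above. This only shows how small the update is; it says nothing by itself about whether such an update can store new facts.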
|
Current fine-tuning techniques can only contribute to knowledge indirectly (for example, by producing better queries for an external data source); you cannot directly embed new facts in the model in any generally efficient or effective manner.
There are toy examples of fine-tuning facts into a model, but at this point they are of little use outside academic settings, and I sense they are contributing to the widespread confusion about fine-tuning's value proposition.