by xkgt
1138 days ago
Correct me if I am wrong, but to use a LoRA fine-tuned model for inference you would still need the original model plus the trained additional layers, right? If we can perfect methods to fine-tune large models for specific tasks while reducing the overall model size, then they can fit onto more consumer-grade hardware for inference and be used more broadly. The objective is to prune unnecessary trivia and memorization artifacts from the model and leverage LLMs purely for interpreting natural language inputs.
You don't need additional layers. After training, the product of the two low-rank matrices can be added to the original weight matrix, so the model size during inference remains the same as the original.
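To make the merge concrete, here is a rough sketch in plain Python (no dependencies) of folding the two LoRA matrices into a single weight matrix. The names `merge_lora`, `W`, `B`, `A`, and the `scale` parameter are my own for illustration, not taken from any particular library, and the tiny random matrices stand in for a real layer.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(x * y for x, y in zip(row, v)) for row in m]

def merge_lora(w, b, a, scale=1.0):
    """Return W + scale * (B @ A). The result has the same shape as W.

    `scale` is a hypothetical stand-in for whatever scaling factor
    was applied to the adapter during training.
    """
    delta = matmul(b, a)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]

random.seed(0)
d_out, d_in, rank = 4, 6, 2  # toy sizes; rank is much smaller than d_in in practice

W = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]  # frozen base weights
B = [[random.gauss(0, 1) for _ in range(rank)] for _ in range(d_out)]  # trained, d_out x rank
A = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(rank)]   # trained, rank x d_in
x = [random.gauss(0, 1) for _ in range(d_in)]                          # one input vector

W_merged = merge_lora(W, B, A)

# Unmerged path: base output plus the adapter output B(Ax).
y_adapter = [base + extra for base, extra in zip(matvec(W, x), matvec(B, matvec(A, x)))]

# Merged path: a single matrix of the original shape.
y_merged = matvec(W_merged, x)
```

Both paths give the same output up to floating-point error, and `W_merged` has exactly the shape of `W`, which is why the merged model is no larger than the original. Note that it is no smaller either, so merging by itself does not shrink the model.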