by the8thbit
703 days ago
"There's a pretty clear difference between the 'finetuning' offered via API by GPT4 and the ability to do whatever sort of finetuning you want and get the weights at the end that you can do with open weights models."

Yes, the difference is that one is provided over a remote API, where the provider can restrict how you interact with it, while the other is performed directly by the user. One is a SaaS solution, the other is a compiled solution, and neither is open source.

"'Brute forcing' is not the correct language to use for describing fine-tuning. It is not as if you are trying weights randomly and seeing which ones work on your dataset - you are following a gradient."

Whatever you want to call it, this doesn't sound like modifying functionality in source code. When I modify source code, I might make a change, check what it does, change the same functionality again, check the new change, and so on, up to maybe a couple dozen times. What I don't do is have a very simple routine make very small modifications to all of the system's functionality, check the result of each small change across the broad spectrum of functionality, and repeat that millions of times.
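For concreteness, the loop being argued about looks roughly like this. It's a minimal sketch on a toy two-weight linear model, not any particular library's training code; `train`, the data, and the learning rate are all made up for illustration.

```python
def train(xs, ys, steps=2000, lr=0.05):
    """Full-batch gradient descent on mean squared error (toy example)."""
    w = [0.0] * len(xs[0])  # every weight starts at zero
    n = len(xs)
    for _ in range(steps):
        # Check the current weights against the whole dataset...
        grad = [0.0] * len(w)
        for x, y in zip(xs, ys):
            err = sum(wi * xi for wi, xi in zip(w, x)) - y
            for j, xj in enumerate(x):
                grad[j] += 2 * err * xj / n
        # ...then nudge every weight a little, in the direction the gradient gives.
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w

# Hypothetical data generated from y = 1 + 2x; the feature vector is [1, x].
xs = [[1.0, x] for x in (0.0, 1.0, 2.0, 3.0)]
ys = [1.0 + 2.0 * x[1] for x in xs]
weights = train(xs, ys)
```

The updates are directed rather than random, but each one touches every weight by a small amount and the loop repeats thousands of times here (and vastly more for a real model).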
|
You can take the weights and train LoRAs (which are close to fine-tuning), but you can also build custom adapters on top, such as classification heads. You can mix models from different fine-tunes or perform model surgery (adding layers, attention heads, or MoE).
You can perform model decomposition and amplify some of its characteristics. You can also train multi-modal adapters for the model. Prompt tuning requires weights as well.
I would even say that having the model is more potent in the hands of individual users than having the dataset.
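To make the LoRA case concrete, here is a rough sketch of the idea in plain Python, assuming a single tiny linear layer. The base weights `W` stay frozen and only a low-rank pair `A`, `B` is trained, so the layer computes `W x + B (A x)`. All names, shapes, and numbers are invented for illustration, and this is not how any real library implements it.

```python
def matvec(M, x):
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def lora_forward(W, A, B, x):
    h = matvec(A, x)  # project down to rank r
    y = [b + u for b, u in zip(matvec(W, x), matvec(B, h))]
    return y, h

def mean_loss(W, A, B, data):
    total = 0.0
    for x, y in data:
        y_hat, _ = lora_forward(W, A, B, x)
        total += sum((p - t) ** 2 for p, t in zip(y_hat, y))
    return total / len(data)

def train_lora(W, A, B, data, steps=2000, lr=0.02):
    """Gradient descent on A and B only; W is read but never written."""
    n = len(data)
    for _ in range(steps):
        gA = [[0.0] * len(A[0]) for _ in A]
        gB = [[0.0] * len(B[0]) for _ in B]
        for x, y in data:
            y_hat, h = lora_forward(W, A, B, x)
            e = [p - t for p, t in zip(y_hat, y)]
            for i, ei in enumerate(e):
                for k, hk in enumerate(h):
                    gB[i][k] += 2 * ei * hk / n
            for k in range(len(A)):
                back = sum(B[i][k] * e[i] for i in range(len(e)))
                for j, xj in enumerate(x):
                    gA[k][j] += 2 * back * xj / n
        A = [[a - lr * g for a, g in zip(ra, rg)] for ra, rg in zip(A, gA)]
        B = [[b - lr * g for b, g in zip(rb, rg)] for rb, rg in zip(B, gB)]
    return A, B

# Frozen base layer (2 outputs, 3 inputs) and a target that differs from it
# by a rank-1 change. Both are hypothetical.
W = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
target = [[1.5, 0.5, 0.0], [-0.5, 0.5, 0.0]]
inputs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 1.0, 1.0]]
data = [(x, matvec(target, x)) for x in inputs]

A0 = [[0.5, -0.5, 0.5]]  # rank r = 1, small nonzero start
B0 = [[0.0], [0.0]]      # zero start, so the adapter is initially a no-op
W_before = [row[:] for row in W]

loss_before = mean_loss(W, A0, B0, data)
A, B = train_lora(W, A0, B0, data)
loss_after = mean_loss(W, A, B, data)
```

None of this is possible through a hosted fine-tuning API: the forward pass needs `W` itself, which is the point about having the weights in hand.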