by kamranjon
373 days ago
This is a pretty awful take. Everyone understands they are modifying the weights - that is the point. It’s not like these models were released with all of the weights perfectly accounted for, so that changing them in any way ruins them. The awesome thing about fine-tuning is that the weights are malleable and you have a great base to start from. Also, the basic premise that knowledge injection is a bad use case seems flawed? There are countless open models released by Google that completely fly in the face of this. MedGemma is just Gemma 3 4B fine-tuned on a ton of medical datasets, and it’s measurably better than stock Gemma within the medical domain. Maybe it lost some ability to answer trivia about Minecraft in the process, but isn’t that kinda implied by “fine-tuning” something? You’re making it purpose-built for a specific domain.