by reissbaker
656 days ago
It's totally task-dependent. On some tasks, large, well-trained models are already great; if the large model already exhibits human-level performance, a small-model finetune is unlikely to beat it. Similarly, on very general tasks (e.g. "coding" in general, as opposed to something specific like "writing idiomatic NextJS"), a small-model finetune is unlikely to beat a large model. But there are plenty of tasks that even large, well-trained models struggle with. If the OP is struggling to get useful root-cause analysis for cloud service incidents out of an existing large model, that seems like exactly the kind of use case where a finetune would shine. Also, finetunes don't have to be just for small models! Medium-sized models like Llama-3.1-70b can be finetuned, and if you want to burn a lot of GPUs you can finetune 405b as well.