by haldujai
1262 days ago
If you specifically mean a general LLM trained on a general language corpus with instruction finetuning, this is correct. Fortunately, very few real-world use cases need to be this general. If you are training an LLM on a domain-specific corpus or finetuning on specific downstream tasks, even relatively tiny models at 330M params are definitely useful, not “toys”, and can accurately perform tasks such as semantic text search, document summarization, and named entity recognition.
Yes, thanks, that's what I meant.
> If you are training an LLM on a domain-specific corpus or finetuning on specific downstream tasks, even relatively tiny models at 330M params are definitely useful, not “toys”, and can accurately perform tasks such as semantic text search, document summarization, and named entity recognition.
Agreed, the BERT family is a good example here.