by leobg
300 days ago
If I have 1,000 labeled examples for a classification task, I’ll expand them into a larger training set using augmentation, and then fine-tune a small model like RoBERTa. It’s fast, cheap, accurate, and predictable. Others have had success with SetFit as the training framework and Ettin as the base model.

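
To make the augmentation step concrete, here is a minimal sketch. The comment above doesn't say which augmentation method is used, so this assumes simple random word deletion and swapping; the names `augment` and `expand_dataset` and the two sample sentences are hypothetical. Paraphrasing or back-translation would slot into the same place.

```python
import random


def augment(text, rng, p_delete=0.1, n_swaps=1):
    """Return a noisy copy of `text`: random word deletion, then random swaps.

    Assumption: this is a stand-in for whatever augmentation you prefer.
    """
    words = text.split()
    if len(words) < 2:
        return text
    # Drop each word with probability p_delete, but never return an empty text.
    kept = [w for w in words if rng.random() > p_delete]
    if not kept:
        kept = [rng.choice(words)]
    # Swap random pairs of the remaining words.
    for _ in range(n_swaps):
        if len(kept) < 2:
            break
        i, j = rng.sample(range(len(kept)), 2)
        kept[i], kept[j] = kept[j], kept[i]
    return " ".join(kept)


def expand_dataset(examples, copies=4, seed=0):
    """Keep each (text, label) pair and add `copies` augmented variants with the same label."""
    rng = random.Random(seed)
    expanded = []
    for text, label in examples:
        expanded.append((text, label))
        for _ in range(copies):
            expanded.append((augment(text, rng), label))
    return expanded


# Hypothetical labeled examples, standing in for the 1,000 real ones.
labeled = [
    ("the battery died after two days", "negative"),
    ("setup was quick and painless", "positive"),
]
train = expand_dataset(labeled, copies=4)
print(len(train))  # 10: each original plus four variants
```

The expanded `train` list is what you would then tokenize and pass to the fine-tuning run. Fixing the seed keeps the augmented set reproducible across runs.
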
I have also considered training a small language model for synthetic data generation.