by azeusCC
556 days ago
I’ve been sticking with PhraseMatcher because it’s simple, fast, and predictable, but your suggestion about using smaller BERT-based models or embeddings like SBERT (sentence-transformers) is intriguing. I’ve avoided LLMs so far because of the computational overhead, but it sounds like even lightweight models can provide significant value. Out of curiosity, when training models like SBERT or even smaller BERT versions, do you see diminishing returns when working with smaller training sets (e.g., a few thousand annotated job descriptions)? My current dataset isn’t huge yet (10k), so I wonder where that line starts to appear. I’ll definitely look more into SBERT and segmentation approaches. Thanks for sharing those!
|
I have also tried fine-tuning BERT models to do the same. It takes at least 30 minutes to train a single model (and that is without any of the model selection I do with the scikit-learn-based models), and I never developed a training protocol that reliably did better than my SVM-based model. My impression is that the small BERT models don't have much learning capacity and don't really benefit from 5000+ documents. That said, really high accuracy isn't possible with my problem: I'm predicting my own fickle judgements, and I feel like I am doing great with an AUC-ROC of 0.78 or so.
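
One way to see where diminishing returns start on a given dataset is to plot a learning curve: train on growing subsets and watch the validation AUC-ROC flatten. The following is only a rough sketch, not the setup described above. The data is synthetic (`make_classification` stands in for real features such as TF-IDF vectors or SBERT embeddings), and the label-noise level, the `C` value, and the subset sizes are all assumptions picked for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import learning_curve
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

# Synthetic stand-in for document features (assumption: in practice X would be
# TF-IDF vectors or sentence embeddings, and y the annotated labels).
# flip_y adds label noise to mimic inconsistent human judgements.
X, y = make_classification(
    n_samples=2000,
    n_features=50,
    n_informative=10,
    flip_y=0.2,
    random_state=0,
)

# Linear SVM on standardized features; C=0.1 is an arbitrary illustrative value.
model = make_pipeline(StandardScaler(), LinearSVC(C=0.1, max_iter=5000))

# Train on 10%..100% of each training fold, scoring held-out folds by AUC-ROC.
train_sizes, train_scores, val_scores = learning_curve(
    model,
    X,
    y,
    train_sizes=np.linspace(0.1, 1.0, 5),
    cv=5,
    scoring="roc_auc",
    shuffle=True,
    random_state=0,
)

# Average validation AUC-ROC across the 5 folds for each training-set size.
mean_val_auc = val_scores.mean(axis=1)
for n, auc in zip(train_sizes, mean_val_auc):
    print(f"{int(n):5d} training examples -> mean validation AUC-ROC {auc:.3f}")
```

If the mean validation score stops improving between the last few sizes, more annotation probably won't help that model much. If it is still climbing at the largest size, the dataset is likely the bottleneck rather than the model.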