Hacker News
by PaulHoule 1123 days ago
I've had success with the method described in

https://huggingface.co/docs/transformers/training

for both classification and regression problems, with the caveats that (i) the default learning rate is too damn high (easy to fix) and (ii) it took a great deal of effort to get the fine-tuned classifier to perform as well as a classifier that uses

https://sbert.net/

and an SVM from scikit-learn. You might get different results with another problem, but my problem is noisy, which puts an upper limit on the accuracy that is possible. Fine-tuning a model takes maybe 30 minutes, while training the classical classifier takes more like 30 seconds, and the ratio of development time that went into each is similar.
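
For anyone who wants to try the comparison, here is a minimal sketch of the two setups mentioned above, not the exact code behind those numbers. The model name "all-MiniLM-L6-v2", the RBF kernel, the StandardScaler step, the 2e-5 learning rate, and all function names are assumptions picked for illustration. The only thing taken from the Transformers docs is that TrainingArguments defaults to a learning rate of 5e-5.

```python
from typing import Callable, Sequence

import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

Encoder = Callable[[Sequence[str]], np.ndarray]


def sbert_encoder(model_name: str = "all-MiniLM-L6-v2") -> Encoder:
    """Return a function that maps texts to sentence embeddings.

    The model name is an assumption; the thread does not say which one was used.
    """
    # Imported lazily so the rest of the module works without the package.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_name)
    return lambda texts: np.asarray(model.encode(list(texts)))


def train_embedding_svm(texts: Sequence[str], labels: Sequence[int], encode: Encoder):
    """Embed the texts once, then fit an SVM on the fixed vectors.

    Kernel and C are scikit-learn defaults written out explicitly (assumed, not tuned).
    """
    features = encode(texts)
    classifier = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
    classifier.fit(features, labels)
    return classifier


def predict(classifier, texts: Sequence[str], encode: Encoder) -> np.ndarray:
    """Embed new texts with the same encoder and classify them."""
    return classifier.predict(encode(texts))


def fine_tuning_arguments(output_dir: str = "finetune-out", learning_rate: float = 2e-5):
    """Build TrainingArguments with a learning rate below the 5e-5 default.

    2e-5 is only an illustrative value; the right one depends on the problem.
    """
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, learning_rate=learning_rate)
```

Usage would look something like `encode = sbert_encoder()`, then `clf = train_embedding_svm(train_texts, train_labels, encode)` and `predict(clf, test_texts, encode)`. The encoder is passed in as a plain function so a different embedding model can be swapped in without touching the SVM part.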

1 comment

Thank you for sharing! The HF docs seem easy to follow. My application is text generation itself, so I may see different results.