Hacker News new | ask | show | jobs
by peadarohaodha 1957 days ago
Adding to what Raza as said - to your point on "real active learning" hardly working. I would be interested to hear what approaches you took? We've found that the quality of the uncertainty estimate for your model is quite important for active learning to work well in practise. So applying good approximations for the model uncertainty for modern sized transformer models (like BERT) is an important consideration
1 comments

Irrc we were using a Bayesian classification model on top of of fixed pretrained features from transfer, something along the lines of refitting a GP every time the number of classes changed. This was images as opposed to text, and after an epoch classification was ~ok but during training (eg the active bit) we didn't see much benefit.