Hacker News
by jessestcharles 2745 days ago
I think that in this case, the alignment between the unlabeled domain data and the language task explains why language task performance converges. One argument for still preferring the ULM+domain model is that it is likely to be more generally capable if you stay in the same domain but switch to a task that is less directly related to your unlabeled data. I haven’t seen any research that speaks directly to that intuition, so it’s a good area for further study.