by jessestcharles
2745 days ago
I think in this case the close alignment between the unlabeled domain data and the language task is why performance on that task converges. One argument for still preferring the ULM+domain model is that it is likely more generally capable if you stay in the same domain but switch to a task that is less directly related to your unlabeled data. I haven't seen any research that speaks directly to that intuition, so it's a good area for further study.