https://pile.eleuther.ai/
https://arxiv.org/abs/2101.00027
I'm bullish on domain-specific models that start from general-purpose models. Something of a T-shape analogy (broad base first, then depth in one domain), but maybe with a couple of distillation and fine-tuning steps along the way.