Hacker News
by sebzim4500 1066 days ago
>Is there a reason I should not believe that an expert model will always outperform a general purpose one, even if it's a metatransformer?

If a general-purpose model beats the specialized one, you could almost certainly distill the general-purpose one into a better specialized one.
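
For anyone unfamiliar with distillation, here is a minimal sketch of the usual objective, assuming the standard soft-label setup: the student is trained to match the teacher's temperature-softened output distribution, measured by KL divergence. The function names and the temperature value are illustrative assumptions, not taken from any particular library.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions.

    Hypothetical helper: a real training loop would average this over a
    batch of domain-specific inputs and backpropagate through the student.
    """
    p = softmax(teacher_logits, temperature)  # teacher (general-purpose model)
    q = softmax(student_logits, temperature)  # student (specialized model)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


if __name__ == "__main__":
    teacher = [4.0, 1.0, 0.5]
    good_student = [3.8, 1.1, 0.4]
    bad_student = [0.5, 1.0, 4.0]
    print(distillation_loss(teacher, good_student))
    print(distillation_loss(teacher, bad_student))
```

The "specialized" part would come from minimizing this loss only on inputs from the target domain, so the smaller model spends all its capacity there.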