Hacker News
by MattRix 51 days ago
I don’t see why this would happen when modern models already use MoE, which gives them most of the benefits of having specialized models.