by MattRix 51 days ago
I don’t see why this would happen when modern models already use MoE (mixture of experts), which gives them most of the benefits of having specialized models.