Hacker News
by m_w_ 2 days ago
Mythos is rumored to be ~10T parameters, so in this case I think the answer is yes, although I'm sure MoE, looped models, etc. play a role in the improvements as well.