Hacker News
MoE inference optimizations: 15% lower expert load by request reordering
(blog.doubleword.ai)
3 points by mezark 25 days ago