Hacker News
by tatef 83 days ago
Yes, definitely agree. It's more of a proof of concept than a practical use case. However, for many smaller MoE models this method can actually be useful, reaching multiple tokens/sec.