by perbu, 59 days ago
MoE is excellent for unified-memory inference hardware like DGX Spark, Mac Studio, etc. The large memory pool means you can fit quite a few billion parameters, and since only a few small experts are active per token, the tokens keep flowing fast.
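To put rough numbers on that, here is a back-of-envelope sketch. Everything in it is an assumption for illustration: the function name `moe_estimate`, the 100B-total / 10B-active model, 4-bit weights (0.5 bytes per parameter), and 250 GB/s of memory bandwidth are made-up figures, not specs for any real machine or model. It also assumes decoding is purely memory-bandwidth bound and ignores KV cache, activations, and routing overhead, so the token rate is an upper bound at best.

```python
def moe_estimate(total_params_b, active_params_b, bytes_per_param, mem_bandwidth_gbs):
    """Rough memory footprint and decode-speed ceiling for a model.

    total_params_b    -- all parameters, in billions (must fit in memory)
    active_params_b   -- parameters read per token, in billions
    bytes_per_param   -- e.g. 0.5 for 4-bit quantized weights (assumption)
    mem_bandwidth_gbs -- memory bandwidth in GB/s (hypothetical figure)
    """
    # Every expert has to be resident, so the footprint scales with total params.
    weights_gb = total_params_b * bytes_per_param
    # Each token only reads the active experts, so per-token traffic scales
    # with active params.
    active_gb = active_params_b * bytes_per_param
    # Bandwidth-bound ceiling: how many times per second the active weights
    # can be streamed from memory.
    tokens_per_s = mem_bandwidth_gbs / active_gb
    return weights_gb, tokens_per_s


# Hypothetical MoE: 100B total, 10B active per token.
moe_gb, moe_tps = moe_estimate(100, 10, 0.5, 250)
# Dense model of the same total size: every parameter is read for every token.
dense_gb, dense_tps = moe_estimate(100, 100, 0.5, 250)

print(f"MoE:   {moe_gb:.0f} GB resident, ~{moe_tps:.0f} tok/s ceiling")
print(f"Dense: {dense_gb:.0f} GB resident, ~{dense_tps:.0f} tok/s ceiling")
```

Under these assumed numbers both models need the same 50 GB of memory, but the MoE's ceiling is 10x higher because it only touches a tenth of the weights per token. That's the whole point: lots of memory to hold the B's, modest bandwidth needed to use them.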