by aheilbut 507 days ago
Is it possible to distill a large model into an (even) smaller MoE model, like OLMoE?