Hacker News
by wongarsu 71 days ago
> 397B params, 17B activated at the same time

Those 17B active parameters might be split among multiple experts that are activated simultaneously for each token, rather than sitting in a single 17B expert.
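
To make that concrete, here is a rough sketch of how total and active parameter counts relate in a mixture-of-experts model with top-k routing. Everything in it is an assumption for illustration: the names `moe_param_counts` and `top_k_experts` are made up, the numbers are toy values in arbitrary units (not this model's actual configuration, which isn't stated here), and it treats the model as one shared block plus one pool of equally sized experts.

```python
def moe_param_counts(shared, per_expert, num_experts, top_k):
    """Return (total, active) parameter counts for a toy MoE model.

    shared     -- parameters every token uses (attention, embeddings, ...)
    per_expert -- parameters in one expert (all experts assumed equal size)
    top_k      -- number of experts the router activates per token
    """
    total = shared + num_experts * per_expert
    active = shared + top_k * per_expert
    return total, active


def top_k_experts(router_scores, k):
    """Indices of the k highest-scoring experts for one token, best first."""
    order = sorted(range(len(router_scores)),
                   key=lambda i: router_scores[i], reverse=True)
    return order[:k]


# Toy values only: 64 experts of size 1, 8 of them active per token.
total, active = moe_param_counts(shared=5, per_expert=1, num_experts=64, top_k=8)
print(total, active)

# One token's hypothetical router scores over 4 experts; pick the best 2.
print(top_k_experts([0.1, 0.7, 0.05, 0.15], k=2))
```

The point is just that the active count is the shared part plus several experts added together, so a headline "active" number alone doesn't tell you how large any single expert is.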