Hacker News new | ask | show | jobs
by cshimmin 66 days ago
The 6 is part of 3.6, the model version. 35B parameters, A3B means it's a mixture of experts model with only 3B parameters active in any forward pass.
1 comments

Got it. Thanks