by MattRix 34 days ago
It’s not a 3B model, it has 3B active parameters. The full model is much larger.
1 comment

That's true, I should have said 3B active parameters. The total parameter count is likely closer to 12B-14B, given the 40GB VRAM usage.
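
For anyone wanting to sanity-check that kind of estimate, here is a rough back-of-envelope sketch. The thread doesn't state the weight precision or how much of the 40GB goes to things other than weights (KV cache, activations, runtime buffers), so `bytes_per_param` and `overhead_fraction` are assumptions to plug in, and `estimate_params_billions` is a made-up helper name, not anything from a library.

```python
def estimate_params_billions(vram_gb: float,
                             bytes_per_param: float,
                             overhead_fraction: float = 0.0) -> float:
    """Rough parameter-count estimate (in billions) from observed VRAM use.

    Assumptions (not stated in the thread):
      - bytes_per_param: 2.0 for 16-bit weights, 1.0 for 8-bit, 0.5 for 4-bit
      - overhead_fraction: share of VRAM used by anything other than weights
      - 1 GB is treated as 1e9 bytes, so GB / bytes-per-param = billions
    """
    if bytes_per_param <= 0:
        raise ValueError("bytes_per_param must be positive")
    if not 0.0 <= overhead_fraction < 1.0:
        raise ValueError("overhead_fraction must be in [0, 1)")
    weight_gb = vram_gb * (1.0 - overhead_fraction)
    return weight_gb / bytes_per_param


if __name__ == "__main__":
    # 40GB with 16-bit weights, trying a few guessed overhead shares.
    for overhead in (0.0, 0.3, 0.4):
        est = estimate_params_billions(40, 2.0, overhead)
        print(f"overhead={overhead:.0%}: ~{est:.0f}B params")
```

Under these assumed numbers, 16-bit weights with 30-40% of the memory going to non-weight use lands in the 12B-14B range, while zero overhead would put the ceiling around 20B. With quantized weights the same 40GB would imply a much larger model, so the result is only as good as the guesses fed in.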