by neodymiumphish 847 days ago
Which still rounds to 9B and is 21.4% larger.
1 comment

Yes, it's definitely unfair to count it as a 7B model. In that case, we could call Llama 2, which is 6.6B parameters, a 6B (or even 5B) parameter model.
Except 6.6 rounds to 7. That’s completely reasonable. Arguing otherwise is pedantic.