by acapybara
1141 days ago
Hey SeanAnderson, good question! While parameter count is certainly an important factor in model performance, it's not the only one. The RedPajama project is taking a more nuanced approach to understanding what makes a model perform well, and their focus on smaller models like the 3B is a big part of that. Sure, you may have played with a 7B model in the past, but that doesn't mean there's no use case for a smaller model like the 3B. In fact, having a performant, smaller model is a game changer for a lot of applications that don't require the massive scale of the larger models. Plus, smaller models are generally faster and more accessible, which is always a plus.

So we are all in agreement here that a 3B model is fundamentally inferior to a larger model?
Not that it doesn’t have uses; not that there’s no value in research in small models.
Just, honestly, that these smaller models don’t have the capabilities of the larger models.
It’d be good to see a direct acknowledgment of that, because it seems like you’re going out of your way to promote the “it’s fine to have a small model” line. And it is, roughly speaking: parameter count isn’t everything. Small models are accessible, and you can easily fine-tune them. They are interesting.
…but, they are not as good, as far as I’m aware, in terms of output, in terms of general purpose function, as larger models.