Hacker News new | ask | show | jobs
by robertlagrant 1142 days ago
> Hey SeanAnderson, good question! While parameter count is certainly an important factor in model performance, it's not the only one. The RedPajama project is taking a more nuanced approach to understanding what makes a model perform well, and their focus on smaller models like the 3B is a big part of that.

Sure, you may have played with a 7B model in the past, but that doesn't mean there's no use case for a smaller model like the 3B. In fact, having a performant, smaller model is a game changer for a lot of applications that don't require the massive scale of the larger models. Plus, smaller models are generally faster and more accessible, which is always a plus.

It's hard to pick out the actual answer: what is the application that this is good at? What has their "more nuanced" approach to understanding performance increased this model's performance at doing?