Hacker News
by YetAnotherNick 195 days ago
The bigger issue is that they define "small" by a fixed total parameter count rather than by active parameters for MoE models, and they didn't account for hardware improvements, etc. If they had defined "small" by price or computational cost, I think they would have seen an increase in small models.

I think using total parameters is fair; it correlates well with the RAM needed to run the model. Otherwise Kimi K2 would be "small" despite having a trillion parameters!
VRAM doesn't matter if you are using an API. Price and performance are what matter.
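
To make the two definitions of "small" in this thread concrete, here is a rough sketch of how total and active parameter counts pull in different directions. Everything in it is an assumption for illustration: the `ModelSize` class, the two hypothetical models, the 2 bytes per parameter (16-bit weights), and the common rule of thumb of about 2 FLOPs per active parameter per token for a forward pass. It is not how the study under discussion classified models.

```python
from dataclasses import dataclass


@dataclass
class ModelSize:
    """Hypothetical container for the two parameter counts being argued about."""
    name: str
    total_params: float   # every weight that has to sit in memory
    active_params: float  # weights actually used per token (same as total for dense)

    def weight_memory_gb(self, bytes_per_param: float = 2.0) -> float:
        # Memory for weights only; assumes 16-bit weights by default
        # and ignores KV cache and activations.
        return self.total_params * bytes_per_param / 1e9

    def flops_per_token(self) -> float:
        # Rough rule of thumb (an assumption): ~2 FLOPs per active parameter
        # per generated token for the forward pass.
        return 2.0 * self.active_params


# Both models are made up for illustration; the numbers are not measurements.
dense = ModelSize("hypothetical dense 32B", total_params=32e9, active_params=32e9)
moe = ModelSize("hypothetical MoE, 1T total / 32B active",
                total_params=1e12, active_params=32e9)

for m in (dense, moe):
    print(f"{m.name}: {m.weight_memory_gb():.0f} GB of weights, "
          f"{m.flops_per_token():.1e} FLOPs per token")
```

Under these assumptions the two models cost about the same compute per token, while the MoE needs roughly 30x the memory for weights. That is the disagreement in a nutshell: ranking by total parameters tracks what it takes to host the model yourself, while ranking by active parameters is closer to what an API user pays for.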