Hacker News
by
maccam912
536 days ago
Is there any rule of thumb for small language models vs. large language models? I've seen Phi-4 called a small language model, but at 14 billion parameters it's larger than some large language models.
3 comments
ekianjo
536 days ago
7B to 9B parameters is usually what we call small. The rule of thumb is that a small model is one you can run on a single GPU.
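To make the single-GPU rule of thumb concrete, here is a rough back-of-the-envelope sketch in Python. Everything in it is an assumption for illustration, not something from the thread: the function names are hypothetical, 2 bytes per parameter assumes fp16/bf16 weights, the 1.2 overhead factor is a loose guess for activations and KV cache, and the 24 GiB card is just an example size.

```python
def weight_memory_gib(n_params: float, bytes_per_param: float = 2.0) -> float:
    """Approximate memory needed for the weights alone, in GiB.

    bytes_per_param: 2.0 for fp16/bf16, 1.0 for 8-bit, 0.5 for 4-bit quantization.
    """
    return n_params * bytes_per_param / 2**30


def fits_on_gpu(n_params: float, gpu_gib: float,
                bytes_per_param: float = 2.0, overhead: float = 1.2) -> bool:
    """Rough check: do the weights plus an assumed overhead margin fit in GPU memory?"""
    return weight_memory_gib(n_params, bytes_per_param) * overhead <= gpu_gib


if __name__ == "__main__":
    gpu_gib = 24  # assumed example GPU size
    for label, n in [("7B", 7e9), ("9B", 9e9), ("14B", 14e9)]:
        print(label,
              f"fp16 weights ~{weight_memory_gib(n):.1f} GiB,",
              "fits" if fits_on_gpu(n, gpu_gib) else "does not fit",
              f"on a {gpu_gib} GiB GPU")
```

Under these assumptions a 7B or 9B model fits on a 24 GiB card in fp16, while a 14B model only fits after quantization (for example `bytes_per_param=0.5`), which is part of why the small/large boundary is fuzzy.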
exitb
536 days ago
It’s not a useful distinction. The first LLMs had fewer than 1 billion parameters anyway.
kittikitti
536 days ago
I would claim that even 500 million parameters could be considered large.