


	by maccam912 536 days ago
	Is there any rule of thumb for small language models vs. large language models? I've seen Phi-4 called a small language model, but with 14 billion parameters it's larger than some large language models.

3 comments

7B to 9B is usually what we call small. The rule of thumb is a model that you can run on a single GPU.
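
To make the single-GPU rule of thumb concrete, here is a rough back-of-the-envelope sketch. It assumes memory is dominated by the weights (parameter count times bytes per parameter) plus a flat 20% overhead for activations and cache. The overhead figure, the function names, and the 24 GB card are illustrative assumptions, not numbers from this thread.

```python
def weight_memory_gb(num_params: float, bytes_per_param: float = 2.0) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes).

    bytes_per_param: 2.0 for fp16/bf16, 1.0 for 8-bit, 0.5 for 4-bit quantization.
    """
    return num_params * bytes_per_param / 1e9


def fits_on_gpu(
    num_params: float,
    gpu_memory_gb: float,
    bytes_per_param: float = 2.0,
    overhead_fraction: float = 0.2,  # assumed headroom for activations/KV cache
) -> bool:
    """Rough check: do the weights plus assumed overhead fit in GPU memory?"""
    needed_gb = weight_memory_gb(num_params, bytes_per_param) * (1 + overhead_fraction)
    return needed_gb <= gpu_memory_gb


if __name__ == "__main__":
    gpu_gb = 24  # hypothetical single consumer GPU
    for name, params in [("7B", 7e9), ("9B", 9e9), ("14B", 14e9)]:
        for label, bpp in [("fp16", 2.0), ("4-bit", 0.5)]:
            verdict = "fits" if fits_on_gpu(params, gpu_gb, bpp) else "does not fit"
            print(f"{name} @ {label}: {weight_memory_gb(params, bpp):.1f} GB weights, {verdict}")
```

Under these assumptions a 7B model in fp16 fits on the 24 GB card, while a 14B model only fits once quantized, which is one reason the "small" label moves with precision and hardware.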

It’s not a useful distinction. The first LLMs had fewer than 1 billion parameters anyway.

I would claim that even 500 million parameters could be considered large.