HN Mirror



	by rish-b 1093 days ago
	A common reason is to reduce cost and latency. Larger models typically require GPUs with more memory (and hence cost more), and each request takes longer to serve because there are more matrix multiplications to do.
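
	To put rough numbers on the memory and compute point, here is a back-of-the-envelope sketch. It assumes 2 bytes per parameter (fp16/bf16 weights) and the common approximation of about 2 FLOPs per parameter per generated token for a forward pass. It ignores activations, the KV cache, and batching, so real requirements are higher. The function names are made up for illustration.

```python
def weight_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Memory needed just to hold the weights, in GB (1 GB = 1e9 bytes).

    bytes_per_param=2 assumes fp16/bf16 weights; use 4 for fp32, 1 for int8.
    """
    return num_params * bytes_per_param / 1e9


def forward_flops_per_token(num_params: float) -> float:
    """Rough forward-pass cost per token: one multiply and one add per weight."""
    return 2 * num_params


if __name__ == "__main__":
    for name, n in [("13B", 13e9), ("30B", 30e9)]:
        mem = weight_memory_gb(n)
        gflops = forward_flops_per_token(n) / 1e9
        print(f"{name}: ~{mem:.0f} GB of weights, ~{gflops:.0f} GFLOPs per token")
```

	Under these assumptions a 13B model needs about 26 GB for weights alone and a 30B model about 60 GB, which is the difference between fitting on one large GPU and needing several.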

1 comments

andreygrehov 1093 days ago

Got it. That makes sense. Thank you. But what about the quality then? Can the quality of a 13B model be the same as that of, say, a 30B model?


rolisz 1093 days ago

Flan-T5 is a 3B model that is of comparable quality to Llama 13B.

Moreover, you can fine-tune a model for your specific tasks, and fine-tuning a smaller model needs fewer resources.


spacebanana7 1093 days ago

As a general principle, larger models produce better-quality output.

However, fine-tuned small models can outperform general-purpose large models on specific tasks.

There are also many lightweight tasks, like basic sentiment analysis, where the accuracy of small models can be good enough to be indistinguishable from that of large models.
