


	by rain1 392 days ago
	The Gemma models are too small to be included in this list. You're right that the T5 stuff is very important historically, but those models are below 11B and I don't have much to say about them. Definitely a very interesting and important set of models though.

2 comments

> too small

Eh?

* Gemma 1 (2024): 2B, 7B

* Gemma 2 (2024): 2B, 9B, 27B

* Gemma 3 (2025): 1B, 4B, 12B, 27B

This is the same range as some Llama models which you do mention.

> important historically

Aren't you trying to give a historical perspective? What's the point of this?

Since you included GPT-2, I would think everything from Google, including T5, would qualify for the list.