| HN Mirror



	by haxton 1018 days ago
	gpt3.5 turbo is (most likely) Curie, which is (most likely) 6.7b params. So, yeah, makes perfect sense that it can't compete with a 70b model on cost.

5 comments

These sites say 154B:

gpt3.5 turbo is a new model, not Curie. As others have stated, it probably uses Mixture of Experts, which lowers inference cost.

Is there a source on that? I've never seen anyone think it's below even 70B

Even at 6.7b params, it still does a much better job at translation than llama 2 70b

If it's MOE that may explain why it's faster and better...

MOE?
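
MoE = Mixture of Experts. A rough sketch of why that could lower inference cost: only a few experts run for each token, so the compute per token tracks the active parameters rather than the total. Everything below is an assumption for illustration: the "2 FLOPs per active parameter per token" rule of thumb, the helper names, and the made-up MoE shape (10B shared, 8 experts of 20B each, 2 routed per token). None of it is a known figure for gpt3.5 turbo or any other real model.

```python
# Rule of thumb (assumption): a forward pass costs roughly
# 2 FLOPs per *active* parameter per generated token.
FLOPS_PER_ACTIVE_PARAM = 2


def flops_per_token(active_params: float) -> float:
    """Approximate forward-pass FLOPs needed to generate one token."""
    return FLOPS_PER_ACTIVE_PARAM * active_params


def moe_active_params(shared: float, per_expert: float, experts_used: int) -> float:
    """Parameters actually exercised per token in a hypothetical MoE model."""
    return shared + per_expert * experts_used


# Dense models: every parameter is active for every token.
dense_6_7b = flops_per_token(6.7e9)
dense_70b = flops_per_token(70e9)

# Hypothetical MoE shape (made up for this example, not a real model).
shared, per_expert, n_experts, experts_used = 10e9, 20e9, 8, 2
moe_total = shared + per_expert * n_experts
moe_active = moe_active_params(shared, per_expert, experts_used)
moe_flops = flops_per_token(moe_active)

print(f"dense 70b vs dense 6.7b compute per token: {dense_70b / dense_6_7b:.1f}x")
print(f"MoE total params:  {moe_total / 1e9:.0f}B")
print(f"MoE active params: {moe_active / 1e9:.0f}B")
print(f"MoE vs dense 70b compute per token: {moe_flops / dense_70b:.2f}x")
```

So under these assumptions a model with far more than 70b total params could still need less compute per token than a dense 70b. Memory to hold all the experts is a separate cost that this sketch ignores.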

I thought it was fairly well established that GPT 3.5 has something like 130B parameters and that GPT 4 is on the order of 600B-1,000B

I remember:

- gpt-3.5 175b params

- gpt-4 1800b params