| HN Mirror



	by yujian 947 days ago
	I'm not sure if I'm missing something from the paper, but are multi-billion parameter models getting called "small" language models now? And when did this paradigm shift happen?

4 comments

hmottestad 947 days ago

All the Llama models, including the 70B one, can run on consumer hardware. You might be able to fit GPT-3 (175B) at Q4 or Q3 on a Mac Studio, but that's probably the limit for consumer hardware. At 4-bit, a 7B model requires some 4GB of RAM, so it should probably be possible to run on a phone, just not very fast.
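
A rough sketch of the arithmetic behind those figures, assuming memory is dominated by the weights alone (parameter count times bits per weight). The function name `weight_memory_gb` is hypothetical, and the estimate ignores the KV cache, activations, and the extra scale data that real quantization formats store, so actual usage will be somewhat higher:

```python
def weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Estimate weight storage in decimal gigabytes (1 GB = 1e9 bytes).

    Assumption: weights only. KV cache, activations, and quantization
    metadata (per-block scales) are not counted.
    """
    total_bits = params_billions * 1e9 * bits_per_weight
    total_bytes = total_bits / 8
    return total_bytes / 1e9


if __name__ == "__main__":
    # Model sizes and bit widths mentioned in the comment above.
    for name, size_b, bits in [
        ("7B at 4-bit", 7, 4),
        ("70B at 4-bit", 70, 4),
        ("175B at 4-bit", 175, 4),
        ("175B at 3-bit", 175, 3),
    ]:
        print(f"{name}: ~{weight_memory_gb(size_b, bits):.1f} GB of weights")
```

Under this assumption a 7B model at 4-bit comes to about 3.5 GB of weights, which lines up with "some 4GB" once runtime overhead is added.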


sa-code 947 days ago

GPT-3.5 Turbo is 20B


kristianp 947 days ago

I doubt that. What's your source?


sa-code 947 days ago

There was a paper published by Microsoft that seemed to leak this detail. I'm on mobile right now and don't have a link, but it should be searchable.


nl 946 days ago

The paper was https://arxiv.org/abs/2310.17680

It has been withdrawn with this note:

> Contains inappropriately sourced conjecture of OpenAI's ChatGPT parameter count from this http URL, a citation which was omitted. The authors do not have direct knowledge or verification of this information, and relied solely on this article, which may lead to public confusion

(the noted URL is just a Forbes blogger with no special qualifications that would make what he claimed particularly credible).


Chabsff 947 days ago

Nowadays, "small" essentially means realistically usable on prosumer hardware.


moffkalast 947 days ago

When 175B, 300B, 1.8T models are considered large, 7B is considered small.


nathanfig 947 days ago

Relative term. In the world of LLMs, 7B is small.
