| HN Mirror



	by yufeng66 932 days ago
	Phi-2 basically demonstrated that you don't need a very large model to figure out language. It's not very smart, but it speaks perfect English. It's not obvious that the best way to gain IQ is to have a larger language model; some other structure might be needed.

1 comment

behnamoh 931 days ago

But isn't that something that even smaller GPT-2 models demonstrated already?

erichocean 931 days ago

No, and this is covered in the various Phi papers, as well as TinyStories [0]:

> Models with around 125M parameters such as GPT-Neo (small) or GPT-2 (small) can rarely generate coherent and consistent English text beyond a few words even after extensive training.

[0] https://arxiv.org/abs/2305.07759
