


	by hoseja 1195 days ago
	How do they keep churning these out this fast? Feels like this kind of technology should take longer to develop, if only through the baby-with-nine-mums-in-one-month adage.

3 comments

thequadehunter 1195 days ago

Funding and public interest.

LLMs have been around for a while and they aren't really that different than they were a few years ago tech-wise. The question was always about being able to get good data and compute power for training/running them.

Now that people understand the capabilities of the tech, it's got potential for profit and there's incentive to throw money at it.


jldugger 1195 days ago

OpenAI is treating GPT as a "foundational model". They spend time training the foundational model, then build on top of that. GPT-3 was published in May 2020. GPT-3.5 ("text-davinci-003" and "code-davinci-002") shipped a year ago, and ChatGPT was just a fine-tune on top of those.

So they've had plenty of time to increase the training set, improve the architecture, and run GPUs at full power to get a GPT-4.


hexomancer 1195 days ago

GPT-3 came out almost 3 years ago. If anything this has been too slow compared to previous ones.
