How do they keep churning these out this fast? Feels like this kind of technology should take longer to develop, if only through the baby-with-nine-mums-in-one-month adage.
LLMs have been around for a while and they aren't really that different than they were a few years ago tech-wise. The question was always about being able to get good data and compute power for training/running them.
Now that people understand the capabilities of the tech, it's got potential for profit and there's incentive to throw money at it.
OpenAI is treating GPT as a "foundational model". They spend time training the foundational model, then build on top of that. GPT-3 was published in May 2020. GPT-3.5 ("text-davinci-003" and "code-davinci-002") shipped a year ago, and ChatGPT was just a fine-tune on top of those.
So they've had plenty of time to increase the training set, improve the architecture and run GPUs at full power to get to GPT-4.