by galaxytachyon
1166 days ago
GPT is a breakthrough in many more ways than just being an advanced LLM. GPT-3 was released a year or so ago and, technically, it was outclassed quite a bit by PaLM from Google in terms of parameter count and by Chinchilla in terms of training. What was amazing was that they managed to build a scalable system from it, capable of serving millions of users at the same time, and for free. The engineering and backend work must have been astounding, and I argue that was the secret sauce for the success of ChatGPT. They did not need to dumb it down or cut corners anywhere. The early releases of ChatGPT and Bing Chat showed they literally put unmodified SOTA models in the hands of users with no price tag attached. These AIs were known a long time ago, but only to some people. Remember how a bunch of billionaires suddenly got concerned about AIs a year or two ago? I bet they got early access to these LLMs. But only by scaling it up could they explore the deeper depths of these models, discover new emergent abilities, and realize how much progress they had actually made. Before, people didn't really expect an LLM to play chess and simulate world models. Now they have found out these things are probably closer to AGI than they thought, and the progress bar got pushed forward. Basically, my rant is that current progress was made over a long time and people just didn't realize how far they had come until they opened it up to the public. I would not expect too many surprises on the scale of ChatGPT in the future. If I am wrong, though, then we will actually start getting serious candidates for an AGI.
As someone in the field, I largely agree with this take, but we're still talking about progress over the course of 5 years or so.
Also, fine-tuning/RLHF considerably advanced the usability of the models for the lay public, and it hasn't been around for that long.