by PeterisP 1213 days ago
Model scaling is a heavily researched area, and so far all the experiments show that scaling doesn't really hit diminishing returns. That was checked in the GPT-2 "era" with model sizes from very small up to GPT-2, and reconfirmed with GPT-3 and then with newer models. It's certainly possible that we eventually encounter diminishing returns, but it is not reasonable to presume that we will any time soon (we have literally zero evidence for that and at least some evidence to the contrary). And even if we do, there's currently no reason to assume that the eventual breaking point is at "GPT-5" rather than "GPT-15" or "GPT-55".

If models significantly bigger than today's got better results, we would have seen papers about it a long time ago, since publishing would help the team or company get more funding. Lots of rich actors have worked on that for years.

If scaling doesn't produce better results, however, then they want their competitors to waste lots of money making the same mistake. There is really no benefit to publishing that, and lots of drawbacks.

Otherwise it seems too much of a coincidence that Google and OpenAI ended up with models of basically the same size. Google could easily have trained a model 5x-10x larger, since that isn't very expensive for them, but for some reason we didn't see that, and GPT-4 just never seems to launch.

It’s not just the cost of training the model, it’s the cost of doing inference at scale. ChatGPT is already borderline too expensive to operate. It’s hard to imagine a larger model that is both economical and used by millions of people on our current hardware.
But if a larger model were good enough to replace a human, for example a Google engineer, then it would still be worth it. So they have surely tried to scale up, and if that extra scale had given results they would have published something about it.

Now, since the larger model wasn't good enough to replace a human engineer, we can rest easy: it won't replace programmers anytime soon. If GPT-4, for example, could replace engineers, OpenAI wouldn't need to monetise ChatGPT; they would just rent out artificial engineers to do coding for $10k a year.