by Jensson 1214 days ago
If models significantly bigger than today's got better results, we would have seen papers about it a long time ago, so that the team or company could get more funding. Lots of rich actors have worked on this for years.

If scaling up doesn't produce better results, however, then they want their competitors to waste lots of money making the same mistakes. There is really no benefit to publishing that, and lots of drawbacks.

Otherwise it seems too much of a coincidence that Google and OpenAI ended up with models of basically the same size. Google could easily have trained a model 5x-10x larger; it isn't that expensive for them. But for some reason we haven't seen that, and GPT-4 just never seems to launch.

1 comments

It’s not just the cost of training the model, it’s the cost of doing inference at scale. ChatGPT is already borderline too expensive to operate. It’s hard to imagine a larger model that is both economical and used by millions of people on our current hardware.
But if a larger model were good enough to replace a human, for example a Google engineer, then it would still be worth it. So they have surely tried to scale up, and if that extra scale had given results they would have published something about it.

Now, since the larger model wasn't good enough to replace a human engineer, we can rest easy: it won't replace programmers anytime soon. If GPT-4, for example, could replace engineers, OpenAI wouldn't need to monetise ChatGPT. They would just rent out artificial engineers to do coding for $10k a year.