Hacker News
by p-e-w 1126 days ago
> Making that ocean deeper is not a trivial problem that we can just throw more compute or data at.

You can't possibly know that, given that we don't actually understand how LLMs work on a high level.

> We've pretty much tapped out that depth with GPT4

GPT-4 is three months old and you're confident that its working principle cannot be extended further? Where do you get that confidence from?


Sam Altman said it himself. He seems like a reasonable source.

If you're familiar with other fields of AI, adding more and more layers to ResNet was the hotness for a while, but the trick eventually stopped working.

Exactly, and OpenAI has been around nearly 8 years and has consumed a huge amount of data with tons of compute. They are just showing us the product now.

It is possible they've reached some 80/20 point and he is being pretty honest about how much further the current approach can really be extended.

That would explain going to Congress and asking for regulation (of their not-quite-there-yet competitors, against whom they want a regulatory moat).

Altman didn't really say that. Reading what he actually said rather than a headline, he was alluding to economic walls. He didn't say anything about diminishing returns on scaling. And if anything, the chief scientist, Ilya, thinks there's a lot left to squeeze.

Sure, Sam Altman, the lying CEO of a tech company (they all do), should be listened to on this matter but not on the part where he thinks AGI is within reach using his approach. Selective hearing.

> You can't possibly know that, given that we don't actually understand how LLMs work on a high level.

It's a fair assumption to make, however. It's basically the 80/20 rule.

AI research isn't a new thing, and I bet you could go back 40 or 50 years to a time when researchers thought they were about to have a massive breakthrough to human-level intelligence.

> GPT-4 is three months old and you're confident that its working principle cannot be extended further? Where do you get that confidence from?

I'm guessing from actually using it.

GPT4 is super impressive and helpful in a practical way, but having used it myself for a while now I get this feeling also. It feels a bit like "it's been fed everything we have, with all the techniques we have, now what?"

There are dozens and maybe hundreds of different approaches that could theoretically get around the limitations of GPT4 that merely haven't been trained at scale yet. There is absolutely no lack of ideas in this space, including potentially revolutionary ones, but they take time and money to prove out.

I'm sure there are lots of ideas, but that doesn't mean they're any good or will necessarily transform AI to the next level.

It's going to take time to figure out what works and what doesn't.

There's a reason why Sam Altman is saying they're not training GPT5, and it's not because they think GPT4 is good enough.

> ... we don't actually understand how LLMs work on a high level.

Are you saying that the people who created ChatGPT don't understand how it works? Or that the rest of us don't?

Training a model doesn't mean you understand what the neurons actually do to influence output. Nobody knows that. That's where the black box analogies come in. We know what goes in the box and what comes out. We don't know what the box is doing to the data.