Hacker News
by insomagent 937 days ago
Let's say a model runs through a few iterations and finds a small, meaningful piece of information via "self-play" (iterating with itself without further prompting from a human).

If the model then distills that information down to a new feature, re-examines the original prompt with the new feature embedded in an extra input tensor, and repeats this process ad infinitum, will the language model's "prime directive" and reasoning ability be sufficient to arrive at new, verifiable and provable conjectures, outside the realm of the dataset it was trained on?
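
For concreteness, the loop described above could be sketched roughly like this. Everything here is a hypothetical stand-in: `self_play`, `toy_model` and `toy_distill` are made-up names, the "extra input tensor" is approximated by a plain list of feature strings, and the round cap replaces "ad infinitum" so the sketch terminates. It only illustrates the control flow, not how a real model would do it.

```python
from typing import Callable, List, Optional

# Hypothetical signatures (assumptions, not a real API):
# a model maps (prompt, features so far) to an output string, and a
# distiller turns that output into one new feature, or None if nothing new.
Model = Callable[[str, List[str]], str]
Distill = Callable[[str, List[str]], Optional[str]]


def self_play(prompt: str, model: Model, distill: Distill,
              max_rounds: int = 10) -> List[str]:
    """Re-run the model on the same prompt, feeding back distilled features."""
    features: List[str] = []
    for _ in range(max_rounds):
        output = model(prompt, features)      # re-examine the original prompt
        feature = distill(output, features)   # distill the output to a feature
        if feature is None:                   # nothing new was found: stop
            break
        features.append(feature)              # embed it for the next round
    return features


# Toy stand-ins so the loop can be run end to end.
def toy_model(prompt: str, features: List[str]) -> str:
    return f"{prompt} | round {len(features) + 1}"


def toy_distill(output: str, features: List[str]) -> Optional[str]:
    # Pretend only three new facts can ever be discovered.
    return f"fact-{len(features) + 1}" if len(features) < 3 else None


print(self_play("original prompt", toy_model, toy_distill))
```

The open question in the post is whether a real distillation step would ever produce something verifiable rather than noise; the sketch says nothing about that.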

If GPT-4,5,...,n can progress in this direction, then we should all see the writing on the wall. Also, the day will come when we don't need to manually prepare an updated dataset and "kick off a new training". Self-supervised LLMs are going to be so shocking.

2 comments

People have done experiments trying to get GPT-4 to come up with viable conjectures. So far it does such a woefully bad job that it isn't worth even trying.

Unfortunately there are rather a lot of issues which are difficult to describe concisely, so here is probably not the best place.

Primary amongst them is the fact that an LLM would be a horribly inefficient way to do this. There are much, much better ways, which have been tried, with limited success.

After a year the entire argument you make boils down to “so far”.
Whereas your post sounds like "Just give the approach more time, it shall continue to incrementally improve until it finally works someday, cuz reasons."

Early attempts at human flight approached it by strapping wings to people's arms and flapping: Do you think that would have eventually worked too, if only we had just given it a bit more time and faith?

> Early attempts at human flight approached it by strapping wings to people's arms and flapping: Do you think that would have eventually worked too, if only we had just given it a bit more time and faith?

Interestingly, we now have human-powered aircraft... We have flown ~60km with human leg power alone. We've also got human-powered ornithopters (flapping-wing designs) which can fly, but only for very short times before the pilot is exhausted.

I expect that another 100 years from now, both records will be exceeded, although probably for scientific curiosity more than because human-powered flight is actually useful.

I knew about the legs (there was a model in the London Science Museum when I was a kid), but I didn't know about the ornithopter.

https://en.wikipedia.org/wiki/UTIAS_Snowbird

13 years ago! Wow, how did I miss that?

> Just give the approach more time, it shall continue to incrementally improve until it finally works someday, cuz reasons

Yes. Because we haven't yet reached the limit of deep learning models. GPT-3.5 has 175 billion parameters. GPT-4 has an estimated 1.8 trillion parameters. That was nearly a year ago. Wait until you see what's next.

Why would adding more parameters suddenly make it better at this sort of reasoning? It feels a bit of a “god of the gaps” where it’ll just stop being a stochastic parrot in just a few more million parameters.
I don't think it's guaranteed, but I do think it's very plausible because we've seen these models gain emergent abilities at every iteration, just from sheer scaling. So extrapolation tells us that they may keep gaining more capabilities (we don't know exactly how they do it, though, so of course it's all speculation).

I don't think many people would describe GPT-4 as a stochastic parrot already... when the paper that coined (or at least popularized) the term came out in early 2021, the term made a lot of sense. In late 2023, with models that at the very least show clear signs of creativity (I'm sticking to that because "reasoning" or not is more controversial), it's relegated to reductionistic philosophical arguments, but not really a practical description anymore.

Why would it not? We've observed them getting significantly better through multiple iterations. It is quite possible they'll hit a barrier at some point, but what makes you believe this iteration will be the point where the advances stop?
Humans and other animals are definitely different when it comes to reasoning. At the same time, biologically, humans and many other animals are very similar when it comes to the brain, but humans have more "processing power". So it's only natural to expect some emergent properties from an increasing number of parameters.
> it’ll just stop being a stochastic parrot in just a few more million parameters.

It is not a stochastic parrot today. Deep learning models can solve problems, recognize patterns, and generate new creative output that is not explicitly in their training set. Aside from adding more parameters there are new neural network architectures to discover and experiment with. Transformers aren't the final stage of deep learning.

Ever heard of something called diminishing returns?

The value improvement between 17.5b parameters and 175b parameters is much greater than the value improvement between 175b parameters and 18t parameters.

IOW, each time we throw 100 times more processing power at the problem, we get a measly twofold increase in value.
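
As a back-of-the-envelope check of what that rule of thumb would imply: if value grew as a power law in compute (an assumption made here only for illustration, not an established scaling law), then "100x compute for 2x value" pins down the exponent. The function name `implied_exponent` is made up for this sketch.

```python
import math


def implied_exponent(compute_factor: float, value_factor: float) -> float:
    """Exponent a such that value ~ compute**a, given one observed ratio.

    Assumes a pure power law, which is an illustrative simplification.
    """
    return math.log(value_factor) / math.log(compute_factor)


# The claim in the comment above: 100x the compute buys 2x the value.
a = implied_exponent(100, 2)
print(f"implied exponent: {a:.3f}")

# Under the same assumed power law, what would 10x the compute buy?
print(f"value gain from 10x compute: {10 ** a:.2f}x")
```

Under that assumption the exponent comes out at about 0.15, so each further 10x in compute would buy roughly a 1.4x gain. Whether "value" can be measured on a single scale like this is a separate argument.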

Yes that's a good point. But the algorithms are improving too.
You are missing the point that it can be a model limit. LLMs were a breakthrough but that doesn’t mean they are a good model for some other problems, no matter the number of parameters. Language contains more than we thought, as GPT has impressively shown (i.e., semantics embedded in the syntax emerging from text compression), but still not every intellectual process is language-based.
I know that, but deep learning is more than LLMs. Transformers aren't the final ultimate stage of deep learning. We haven't found the limit yet.
Indeed. An LLM is an application on top of a transformer trained with backpropagation. What stops you from adding a logic/mathematics "application" on the same transformer?
Nothing, and there are methods which allow these types of models to learn to use special purpose tools of this kind[1].

[1] https://arxiv.org/abs/2302.04761 Toolformer: Language Models Can Teach Themselves to Use Tools
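
To give a feel for what "using a special-purpose tool" can look like mechanically, here is a minimal sketch. It is not the Toolformer implementation: the bracketed `[Calculator(...)]` marker is a simplified syntax assumed for this example, and `run_tools` and `calculator` are hypothetical names. The idea is only that the model emits a call marker in its text, and an outer program executes the call and splices the result back in.

```python
import ast
import operator
import re

# Assumed inline call syntax for this sketch; loosely inspired by the idea of
# API calls embedded in generated text, but not the paper's exact format.
CALL_PATTERN = re.compile(r"\[Calculator\(([^\]]*)\)\]")

_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def _eval(node: ast.AST) -> float:
    """Evaluate a tiny arithmetic AST: numbers, + - * / and unary minus."""
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -_eval(node.operand)
    raise ValueError("unsupported expression")


def calculator(expression: str) -> float:
    """The 'tool': exact arithmetic, without calling eval() on model output."""
    return _eval(ast.parse(expression, mode="eval").body)


def run_tools(generated_text: str) -> str:
    """Replace each call marker with the call plus its computed result."""
    def substitute(match: re.Match) -> str:
        expr = match.group(1)
        return f"[Calculator({expr}) -> {calculator(expr):.2f}]"
    return CALL_PATTERN.sub(substitute, generated_text)


text = "Out of 1400 participants, 400 [Calculator(400 / 1400)] passed."
print(run_tools(text))
```

The arithmetic is done by ordinary code rather than by the network, which is the point of the "application on the same transformer" suggestion: the model only has to learn when to ask.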

Yes, it seems like this is a direction that could replace RLHF, so it is another way to scale without bare metal. And if not this, then it is still just a matter of time before some model optimization outperforms the raw epoch/parameters/token approach.