Hacker News
by not-chatgpt 794 days ago
Cool idea. Never gonna work. LLMs are still generative models that spit out training data, incapable of highly abstract creative tasks like research.

I still remember all the GPT-2 based startup idea generators that spat out pseudo-feasible startups.

6 comments

Ignoring the “spits out training data” bit, which is at best misleading, it’s interesting that you use the word “abstract” here.

I recently followed Karpathy’s GPT-from-scratch tutorial and was fascinated with how clearly you could see the models improving.

With no training, the model spits out uniformly random text. With a bit of training, the model starts generating gibberish. With further training, the model starts recognizing simple character patterns, like putting a consonant after a vowel. Then it learns syllables, and then words, and then sentences. With enough training (and data and parameters, of course) you eventually get a model like GPT-4 that can write better code than many programmers.

It’s not always that clear cut, but you can clearly observe it moving up the chain of abstraction as the training loss decreases.
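To make the "no training" starting point concrete, here is a rough sketch of the loss you would expect from a model that guesses every character with equal probability. The 65-character vocabulary is an assumption on my part (it is the size I recall for the character-level dataset in that tutorial), and the function name is made up for illustration.

```python
import math


def uniform_cross_entropy(vocab_size: int) -> float:
    """Cross-entropy in nats of a model that gives every token equal probability."""
    return -math.log(1.0 / vocab_size)


# Assumption: a character-level vocabulary of 65 symbols.
VOCAB_SIZE = 65

# An untrained model should start near this value; anything lower
# means it has picked up at least some structure from the data.
baseline_loss = uniform_cross_entropy(VOCAB_SIZE)
print(f"baseline loss for {VOCAB_SIZE} symbols: {baseline_loss:.3f} nats")
```

Every drop below that baseline corresponds to the model learning something about the text, which is the progression described above.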

What happens when you go even bigger than GPT-4? We have every reason to believe that the models will be able to think more abstractly.

Your “never gonna work” comment flies in the face of the exponential curve we find ourselves on.

If we keep extrapolating, eventually GPT will be omniscient. I really can't think of any reason why that wouldn't be the case, given the exponential curve we find ourselves on.
How do you know you're not on a logistic curve?

Don't you think costs and the availability of training data might impose some constraints?

With real-world phenomena that have resource constraints anywhere, a good rule of thumb is: if it looks like an exponential curve, walks like an exponential curve, and quacks like an exponential curve, it’s definitely a logistic curve.
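A quick way to see why the two are so easy to confuse: early on, a logistic curve and an exponential with the same growth rate are nearly identical, and they only diverge as the ceiling gets close. This is just a sketch with made-up parameters (a carrying capacity of 1000, a starting value of 1, a growth rate of 1), not a model of anything real.

```python
import math

# Hypothetical parameters chosen purely for illustration.
CAPACITY = 1000.0   # ceiling the logistic curve saturates at
START = 1.0         # value at t = 0
RATE = 1.0          # growth rate shared by both curves


def exponential(t: float) -> float:
    """Unbounded exponential growth from START."""
    return START * math.exp(RATE * t)


def logistic(t: float) -> float:
    """Logistic growth from START that levels off at CAPACITY."""
    return CAPACITY / (1.0 + ((CAPACITY - START) / START) * math.exp(-RATE * t))


def relative_gap(t: float) -> float:
    """How far the logistic curve sits below the exponential, as a fraction."""
    return (exponential(t) - logistic(t)) / exponential(t)


for t in (0, 2, 4, 6, 8, 10):
    print(f"t={t:2d}  exp={exponential(t):10.1f}  "
          f"logistic={logistic(t):7.1f}  gap={relative_gap(t):.1%}")
```

With these numbers the gap is under 1% at t=2, yet by t=10 the exponential has blown past the ceiling while the logistic curve is flattening out below it. From the early data alone you can't tell which one you're on.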
The entire universe is training data.
It is, but we -- humans, and computers -- are limited in our ability to learn from it. We both learn more easily from structured data, like textbooks.
This has the form of a religious belief.
And also non-religious belief...paradoxical!
I think they're being facetious?
I am. And I think it says a lot about the state of things that many people think I'm being completely serious.
I asked ChatGPT to generate hypotheses on my PhD topic, where I know every single piece of existing literature, and it actually threw out some very interesting ideas that do not exist out there yet (this was before they lobotomized it).
Did you try with the API directly? I've had great results with my own prompts, much less so with the ChatGPT one.
> (this was before they lobotomized it)

Of course, of course. Because god forbid anyone be able to reproduce your suggestion. Funnily enough, I tried the same and had the exact opposite experience.

I think that ship has sailed, if you believe the paper (which I do).

LLMs are already super-human at some highly abstract creative tasks, including research.

There are numerous examples of LLMs solving problems that couldn't be found in the training data. They can also be improved by using reasoning methods like truth tables or causal language. See Orca from Microsoft for example.

They don't just spit out training data, they generalize from training data. They can look at an existing situation and suggest lines of experimentation or analysis that might lead to interesting results based on similar contexts in other sciences or previous research. They're undertrained on bleeding-edge science, so they're going to falter there, but they can apply methodology just fine.
They just need to be better at it than humans, which is a rather low bar when you go beyond two unrelated fields.
When you're this confident and making blanket statements that are this unilateral, that should tell you you need to take a step back and question yourself.