by abecedarius 503 days ago
Chiang makes some insightful points, e.g. about what we mean by magic.

Then I come to

> [LLMs] can get better at reproducing patterns found online, but they don’t become capable of actual reasoning; it seems that the problem is fundamental to their architecture.

and wonder how an intelligent person can still think this, can be so absolute about it. What is "actual" reasoning here? If an AI proves a theorem is it only a simulated proof?

20 comments

Counter point: what is it about scraping the Internet and indexing it cleverly that makes you believe that would lead to the creation of the ability to reason above its programming?

No one in neuroscience, psychology or any related field can point to reasoning or 'consciousness' or whatever you wish to call it and say it appeared from X. Yet we have this West Coast IT cultish thinking that if we throw money at it we'll just spontaneously get there. The idea that we're even 1% close should be ridiculous to anyone rationally looking at what we're currently doing.

> No one in neuroscience, psychology or any related field can point to reasoning or 'consciousness' or whatever you wish to call it and say it appeared from X.

This is not a good argument. Natural systems, the subject of neuroscience/psychology, are much harder to analyze than artificial systems. For example, it's really difficult to study atmospheric gases and figure out Boyle's or Charles's law. But put a gas in a closed chamber and change pressure or temperature and these laws are trivially apparent.

LLMs are much more legible systems than animal brains, and they are amenable to experiment. So, it is much more likely that we will be able to identify what "reasoning" is by studying these systems than animal brains.

P.S. Don't think we are there yet, as much as internet commentators might assert.

Yea but following your example/analogy you have gas-gas but brain-llm. So how can we then experiment? It's a simulation at best.
Both jets and birds fly but do it in a completely different way. Who said that there's only one way to achieve reasoning?
This feels like an appropriate place to share this again:

> "The question of whether a computer can think is no more interesting than the question of whether a submarine can swim." - Edsger Dijkstra

Parrots can both fly and talk, what about that!?
This paper may be interesting to some of you:

Discretization of continuous input spaces in the hippocampal autoencoder

https://arxiv.org/pdf/2405.14600

I think it's really up to the highly nebulous definition. Even your comment implies that reasoning and consciousness are two names for the same thing, but I'd argue one is here and one will never be provable. Reason is working through logical steps, much like a program. It's a set of conditions that get checked and a logical structure that uses that information to reach a conclusion. That's what sets it apart from gut feelings or emotional thinking: it's a traceable structure with "reasons". I can watch the LLM speak base facts out loud, then begin to synthesize them giving _reasons_ for the choices it's making, culminating in a final conclusion. It's already doing that. That is what I call reason. It doesn't mean it's human, it doesn't mean it's "aware of itself", it just means it's thinking a train of thought with concrete steps between each car. Consciousness is completely undefinable and useless as a metric and will never be provably achieved.
I agree that reasoning and consciousness are different, however what I do not see being discussed by the AI research community is the necessity to define and then develop "artificial comprehension".

At this point in time, the act of comprehension is a scientific mystery.

I'd say 'consciousness' is the ongoing ever present comprehension of the moment, a feedback self conversation assessing the current situation a being finds itself in. This act requires reasoning, as comprehension is the "sandbox" in which reasoning occurs.

But what is comprehension? It's the instantaneous reverse engineering of observations for verification of reality: is what I observe normal, possible or a threat? If one cannot "understand" an observation then the potential that the observation is a threat grows. That "understanding" is reverse engineering the observation to identify its range of possible behavior and therefore one's safety in relation to that observation.

Comprehension is extremely complex: arbitrary input goes in and a world model with one's safety and next actions comes out.

Thanks for this. Do you have a blog somewhere, preferably with an RSS feed?
I have a never updated blog, but I'm not an active research scientist. I'm just a plain ordinary over educated guy, who's been writing software using AI, across all the things that have been called "AI" for about 45 years. One could say I'm "over read": at one point I'd read every single Nobel Literature winner, I have finished dozens of authors, and my personal taste is mind fuck philosophy in narrative fiction - think Clockwork Orange, Philip K Dick, and beatnik literature. I post a lot of my opinions at Quora: https://www.quora.com/profile/Blake-Senftner
I don't care about qualifications or job titles, if I read a solid piece of text that makes me think differently, I want to know more. ;) I bookmarked your Quora page and blog in my RSS reader, so if you ever start blogging... And thanks for pointing to Philip K Dick, I might actually start reading science fiction.
The assumption is that since there is already a neural network that “got there” (our brains), we should be able to achieve the same thing synthetically.

We just need to figure out how to train that network.

Neural networks are a simplification of our brains, they are not a replication of it. It is just a modeling method that was inspired by how human neurons work, that's it. It's not 1 to 1 or anything.
Furthermore, neurons alone do not lead to consciousness. At the very least, their modulation, mainly by glial cells, is essential as well.

Personally, my money is on quantum coherence within microtubules being the mechanism of conscious experience, with the brain essentially being a quantum/classical hybrid computer.

It may be possible to argue that current work in AI leads to some definition of intelligence, which apparently is often equated with consciousness by some.

My take is that it is just unaware intelligence like in Peter Watts’ book Blindsight. A terrific read and a quite scary prospect.

It's more that if you actually work with LLMs they will display reasoning. It's not particularly good or deep reasoning (I would generally say they have a superhuman amount of knowledge but are really quite unintelligent), but it is more than simply recall.
Waters are often muddied here by our own psychology. We (as a species) tend to ascribe intelligence to things that can speak. Even more so when someone (or thing in this case) can not just speak, but articulate well.

We know these are algorithms, but how many people fall in love or make friends over nothing but a letter or text message?

Capabilities for reasoning aside, we should all be very careful of our perceptions of intelligence based solely on a machine's or algorithm's apparent ability to communicate.

>we should all be very careful of our perceptions of intelligence based solely on a machine's or algorithm's apparent ability to communicate.

I don't think that's merely an irrational compulsion. Communication can immediately demonstrate intelligence, and I think it quite clearly has, in numerous ways. The benchmarks out there cover a reasonable range of measurements that aren't subjective, and there are clear yes-or-no answers to whether the communication is showing real ways to solve problems (e.g. changing a tire, writing lines of code, solving word problems, critiquing essays), where the output proves it in the first instance.

Where there's an open question is in whether you're commingling the notion of intelligence with consciousness, or identifying intelligence with AGI, or with "human like" uniqueness, or some other special ingredient. I think your warning is important and valid in many contexts (people tend to get carried away when discussing plant "intelligence", and earlier versions of "AI" like Eliza were not the real deal, and Sophia the robot "granted citizenship" was a joke).

But this is not a case, I think, where it's a matter of intuitions leading us astray.

> Where there's an open question is in whether you're commingling the notion of intelligence with consciousness

I’m absolutely commingling these two things and that is an excellent point.

Markov chains and other algorithms that can generate text can give the appearance of intelligence without any kind of understanding or consciousness.
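
To make the Markov chain point concrete, here is a minimal sketch of a bigram text generator. The toy corpus and the function names are made up for illustration; the only thing the program "knows" is which word has followed which, yet the output can look locally fluent.

```python
import random
from collections import defaultdict

def build_bigram_model(text):
    """Map each word to the list of words that follow it in the text."""
    words = text.split()
    model = defaultdict(list)
    for current, following in zip(words, words[1:]):
        model[current].append(following)
    return model

def generate(model, start, length, seed=0):
    """Walk the chain: repeatedly pick a random successor of the last word."""
    rng = random.Random(seed)
    output = [start]
    for _ in range(length - 1):
        successors = model.get(output[-1])
        if not successors:  # dead end: no word ever followed this one
            break
        output.append(rng.choice(successors))
    return " ".join(output)

# Toy corpus, purely illustrative.
corpus = "the cat sat on the mat and the dog sat on the rug"
model = build_bigram_model(corpus)
print(generate(model, "the", 8))
```

Every adjacent word pair in the output already occurs in the corpus, which is the whole trick: plausible-looking text with no model of cats, mats, or sitting.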

I’m not personally certain if consciousness is even requisite for intelligence, given that as far as we know consciousness is an emergent property stemming from some level of problem solving ability.

This seems like the classic shifting of goalposts to determine when AI has actually become intelligent. Is the ability to communicate not a form of intelligence? We don't have to pretend like these models are super intelligent, but to deny them any intelligence seems too far for me.
My intent was not to claim communication isn’t a sign of intelligence, but that the appearance of communication and our tendency to anthropomorphize behaviors that are similar to ours can result in misunderstandings as to the current capabilities of LLMs.

glenstein made a good point that I was commingling concepts of intelligence and consciousness. I think his commentary is really insightful here: https://news.ycombinator.com/item?id=42912765

AI certainly won't be intelligent while it has episodic responses to queries with no ability to learn from or even remember the conversation without it being fed back through as context. This is the current case for LLM models. Token prediction != Intelligence no matter how intelligent it may seem. I would say adaptability is a fundamental requirement of intelligence.
>AI certainly won't be intelligent while it has episodic responses to queries with no ability to learn from or even remember the conversation without it being fed back through as context.

Thank God no one at the AI labs is working to remove that limitation!

The guy in Memento is clearly still an intelligent human despite having no memory. These arguments always strike me as coming from a "humans are just special okay!" place. Why are you so determined to find some way in which LLMs aren't intelligent? Why gatekeep so much?
I mean humans have short term and long term memory, short term memory is just our context window.
Are they displaying reasoning, or the outcome of reasoning, leading you to a false conclusion?

Personally, I see ChatGPT say "water doesn't freeze at 27 degrees F" and think "how can it possibly do advanced reasoning when it can't do basic reasoning?"

I'm not saying it reasons reliably, at all (nor has much success with anything particularly deep: I think in a lot of cases it's dumber than a lot of animals in this respect). But it does a form of general reasoning which other more focused AI efforts have generally struggled with, and it's a lot more successful than random chance. For example, see how ChatGPT can be persuaded to play chess. It still will try to make illegal moves sometimes, hallucinating pieces in the board state or otherwise losing the plot. But if you constrain it and only consider the legal moves, it'll usually beat the average person (i.e. someone who understands the rules but has very little experience), even if it'll be trounced by an experienced player. You can't do this just by memorisation or random guessing: chess goes off-book (i.e. into a game state that has never existed before) very quickly, so it must have some understanding of chess and how to reason about the moves to make, even if it doesn't color within the lines as well as a comparatively basic chess engine.
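
A rough sketch of what "constrain it and only consider the legal moves" could look like in practice. The move scores below are invented for illustration, and in a real setup the legal-move set would come from a chess library or engine rather than being typed in by hand.

```python
def pick_legal_move(move_scores, legal_moves):
    """Return the highest-scoring move among the legal ones, or None."""
    legal_scored = {m: s for m, s in move_scores.items() if m in legal_moves}
    if not legal_scored:
        return None
    return max(legal_scored, key=legal_scored.get)

# Hypothetical scores a model might assign to candidate moves.
# "Qh7" stands in for a hallucinated, illegal move.
scores = {"Qh7": 0.41, "Nf3": 0.33, "e4": 0.21, "d4": 0.05}
legal = {"Nf3", "e4", "d4"}

print(pick_legal_move(scores, legal))  # Nf3
```

The model's top choice is thrown away because it is illegal, and the best remaining candidate is played instead. The filter adds no chess skill of its own; whatever quality the chosen move has still comes from the model's ranking.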

(Basically, I don't think there's a bright line here: saying "they can't reason" isn't very useful, instead it's more useful to talk about what kinds of things they can reason about, and how reliably. It's kind of amazing that this is an emergent behaviour of training on text prediction, but on the other hand, because prediction is the objective function of the training, it's a very fuzzy kind of reasoning and it's not obvious how to make it more rigorous or deeper in practice.)

This is the most pervasive bait-and-switch when discussing AI: "it's general reasoning."

When you ask an LLM "what is 2 + 2?" and it says "2 + 2 = 4", it looks like it's recognizing two numbers and the addition operation, and performing a calculation. It's not. It's finding a common response in its training data and returning that. That's why you get hallucinations on any uncommon math question, like multiplying two random 5 digit numbers. It's not carrying out the logical operations, it's trying to extract an answer by next token prediction. That's not reasoning.

When you ask "will water freeze at 27F?" and it replies "No, the freezing point of water is 32F", what's happening is that it's not recognizing that 27 and 32 are numbers, that a freezing point is an upper threshold, and that any temperature lower than that threshold will therefore also be freezing. It's looking up the next token and finding nothing about how 27F is below freezing.

Again, it's not reasoning. It's not exercising any logic. Its huge training data set and tuned proximity matching help it find likely responses, and when it seems right, that's about the token relationship pre-existing in the training data set.

That it occasionally breaks the rules of chess just shows it has no concept of those rules, only that the next token for a chess move is most likely legal because most of its chess training data is of legal games, not illegal moves. I'm unsurprised to find that it can beat an average player if it doesn't break the rules: most chess information in the world is about better than average play.

If an LLM came up with a proof no one had seen, but it checks out, that doesn't prove it's reasoning either, just because it's next token prediction that came up with it. It found token relationships no one had noticed before, but that's inherent in the training data, and not a reflective intelligence doing logic.

When we discuss things like reinforcement learning and chain of reasoning, what we're really talking about are ways of restricting/strengthening those token relationships. It's back-tuning of the training data. Still not doing logic.

Put more succinctly: if it came up with a new proof in math that was then verified, and you went back and said "no, that's wrong" it would immediately present a different proof, denying the validity of its first proof, because it didn't construct anything logical that it can stand on and say "no, I'm right".
These are all examples of how they're not very good at reasoning, not that they don't reason at all. Being a perfectly consistent logical process is not a requirement for reasoning.
I don't think any of us are qualified to tell the difference between exhibiting reasoning and mixing examples taken from the entire internet. Maybe if the training data was small enough to comprehend in its entirety, we could say one way or the other, but as it stands none of us have read the entire internet, and we have no way of finding the stackoverflow or Reddit conversation that most closely resembles a given chain of thought.
Yes, my judgement too from messing with Claude and (previously) ChatGPT. 'Ridiculous' and 'cultish' are overton-window enforcement more than they are justified.
From its answers I already conclude it is already reasoning above its programming. I do not see why someone in neuroscience or psychology would need to say it appeared, since they do not know better what reasoning is than any average human.

Reasoning is undefined, but a human recognizes it when it appears. I don't see consciousness as part of that story. Also, whether you call it emulated or played reasoning or not apparently does not matter. The results are what they are.

If I write a book that contains Einstein's theory of relativity by virtue of me copying it, did I create the theory? Did my copying of it indicate anything about my understanding of it? Would you be justified to think the next book I write would have anything of original value?

I think what he is trying to say is that LLMs' current architecture seems to mainly work by understanding patterns in the existing body of knowledge. In some senses finding patterns could be considered creative and entail reasoning. And that might be the degree to which LLMs could be said to be capable of reasoning or creativity.

But it is clear humans are capable of creativity and reasoning that are not reducible to mere pattern matching and this is the sense of reasoning that LLMs are not currently capable of.

> If I write a book that contains Einstein's theory of relativity by virtue of me copying it, did I create the theory? Did my copying of it indicate anything about my understanding of it? Would you be justified to think the next book I write would have anything of original value?

No, but you described a `cp` command, not an LLM.

"Creativity" in the sense of coming up with something new is trivial to implement in computers, and has long been solved. Take some pattern - of words, of data, of thought. Perturb it randomly. Done. That's creativity.

The part that makes "creativity" in the sense we normally understand it hard, isn't the search for new ideas - it's evaluation of those ideas. For an idea to be considered creative, it has to match a very complex... wait for it... pattern.

That pattern - what we call "creative" - has no strict definition. The idea has to be close enough to something we know, so we can frame it, yet different enough from it as to not be obvious, but still not too different, so we can still comprehend it. It has to make sense in relevant context - e.g. a creative mathematical proof has to still be correct (or a creative approach to proving a theorem has to plausibly look like it could possibly work); creative writing still has to be readable, etc.

The core of creativity is this unspecified pattern that things we consider "creative" match. And it so happens that things matching this pattern are a match for pattern "what makes sense for a human to read" in situations where a creative solution is called for. And the latter pattern - "response has to be sensible to a human" - is exactly what the LLM goal function is.

Thus follows that real creativity is part of what LLMs are being optimized for :).

> For an idea to be considered creative, it has to match a very complex... wait for it... pattern.

If we could predefine what would count as creativity as some specific pattern, then I'm not sure that would be what I would call creative, and certainly wouldn't be an all-inclusive definition of creativity. Nor is creativity merely creating something new by perturbing data randomly as you mentioned above.

While LLMs might be capable of some forms of creativity depending on how you define it, I think it remains to be seen how LLMs' current architecture could on its own accomplish the kinds of creativity implicit in scientific progress in the Kuhnian sense of a paradigm shift or in what some describe as a leap of artistic inspiration. Both of these examples highlight the degree to which creativity could be considered both progress in an objective sense but also be something that is not entirely foreshadowed by its precursors or patterns of existing data.

I think there are many senses in which LLMs are not demonstrating creativity in a way that humans can. I'm not sure how an LLM itself could create something new and valuable if it requires predefining an existing pattern which seems to presuppose that we already have the creation in a sense.

My take on Kuhn's paradigm shift is that it's still incremental progress, but the shift happens at a meta level. I.e., for the scientific example, you need some accumulated amount of observations and hypotheses before the paradigm shift can happen, and while the science "before" and "after" may look hugely different, it's still the case that the insight causing the shift is still incremental. In the periods before paradigm shifts, the science didn't stay still, waiting for a lone genius to make a big conceptual leap that randomly happened to hit paydirt -- if we could do such probability-defying miracles, we'd have special relativity figured out by Ancient Greeks. No, the science just kept accumulating observations and insights, narrowing down the search space until someone (usually several someones around the world, at the same time) was in the right place and context to see the next step and take it.

This kind of follows from the fact that, even if the paradigm-shifting insight was caused by some miracle feat of a unique superhuman genius, it still wouldn't shift anything until everyone else in the field was able to verify the genius was right, that they found the right answer, as opposed to a billion different possible wrong answers. To do that, the entire field had to have accumulated enough empirical evidence and theoretical understanding to already be within one or two "regular smart scholar" leaps from that insight.

With art, I have less experience, but my gut instinct tells me that even there, "artistic inspiration" can be too big a leap from what was before, as otherwise other people would not recognize or appreciate it. Also, unlike science, the definition of "art" is self-referential: art is what people recognize as art.

Still, I think you make a good point here, and you convinced me that the potential for creativity of LLMs, in their current architecture, is limited and below that of humans. You said:

> While LLMs might be capable of some forms of creativity depending on how you define it, I think it remains to be seen how LLMs' current architecture could on its own accomplish the kinds of creativity implicit in scientific progress in the Kuhnian sense of a paradigm shift or in what some describe as a leap of artistic inspiration.

I think the limit stems strictly from LLMs being trained off-line. I believe LLMs could go as far as making the paradigm-shifting "Kuhnian leap", but they wouldn't be able to increment on it further. Compared to humans, LLMs are all "system 1" and almost none "system 2" - they rely on "intuition"[0], which heavily biases them towards things they've learned before. In the wake of a paradigm shift, a human can make themselves gradually unlearn their own intuitions. LLMs can't, without being retrained. Because of that, the forms of creativity that involve making a paradigm-shifting leap and making a few steps forward from it are not within reach of any current model.

--

[0] - LLMs basically output things that seem most likely given what came before; I think this is the same phenomenon as when humans think and say what "feels like best" in context. However, we can pause and override this; LLMs can't, because they're just run in a forward pass - they neither have an internal loop, nor are they trained for the ability to control an external one.

> "Creativity" in the sense of coming up with something new is trivial to implement in computers, and has long been solved. Take some pattern - of words, of data, of thought. Perturb it randomly. Done. That's creativity.

Formal Proof Systems aren't even nearly close to completion, and for patterns we don't have a strong enough formal system to fully represent the problem space.

If we take the P=NP problem, it can likely be settled formally, in a way a machine could carry out, but what is the "pattern" that we are traversing here? There is definitely a deeper superstructure behind these problems, but we can only glean the tips, and I don't think LLMs with statistical techniques can glean any further either. Natural language is not sufficient.

Whatever the underlying "real" pattern is, doesn't really matter. We don't need to represent it. People learn to understand it implicitly, without ever seeing some formal definition spelled out - and learn it well enough that if you take M works to classify as "creative" or "not", then pick N people at random and ask each of them to classify each of the works, you can expect high degree of agreement.
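
As a rough illustration of the "M works, N people" thought experiment, agreement can be measured without ever defining the pattern. The sketch below uses invented ratings (3 raters, 4 works) and a simple mean pairwise agreement; a real study would more likely use a chance-corrected statistic.

```python
from itertools import combinations

def mean_pairwise_agreement(labels_by_rater):
    """labels_by_rater: N lists, each holding M True/False 'creative' labels.

    Returns the fraction of works on which two raters agree,
    averaged over every pair of raters.
    """
    pairs = list(combinations(labels_by_rater, 2))
    total = 0.0
    for a, b in pairs:
        matches = sum(x == y for x, y in zip(a, b))
        total += matches / len(a)
    return total / len(pairs)

# Invented ratings: each row is one rater, each column one work.
ratings = [
    [True, True, False, False],
    [True, True, False, True],
    [True, False, False, False],
]

print(round(mean_pairwise_agreement(ratings), 3))  # 0.667
```

A value near 1.0 would mean the raters share an implicit notion of "creative" even though none of them could write it down.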

LLMs aren't learning what "creativity" is from first principles. They're learning it indirectly, by being trained to reply like a person would, literally, in the fully general meaning of that phrase. The better they get at that in general, the better they get at the (strict) subtask of "judging whether a work is creative the same way a human would" - and also "producing creative output like a human would".

Will that be enough to fully nail down what creativity is formally? Maybe, maybe not. On the one hand, LLMs don't "know" any more than we do, because whatever the pattern they learn, it's as implicit in their weights as it is for us. On the other hand, we can observe the models as they learn and infer, and poke at their weights, and do all kinds of other things that we can't do to ourselves, in order to find and understand how the "deeper superstructure behind these problems" gets translated into abstract structures within the model. This stands a chance to teach us a lot about both "these problems" and ourselves.

EDIT:

One could say there's no a priori reason why those ML models should have any structural similarity to how human brains work. But I'd say there is a reason - we're training them on inputs highly correlated with our own thoughts, and continuously optimizing them not just to mimic people, but to be bug for bug compatible with them. In the limit, the result of this pressure has to be equivalent to our own minds, even if not structurally equivalent. Of course the open question is, how far can we continue this process :).

As far as I can tell, I think you are interchanging the ability to recognize creativity with the ability to be creative. Humans seem to have the ability to make creative works or ideas that are not entirely derivative from a given data set or fit the criteria of some pre-existing pattern.

That is why I mentioned Kuhn and paradigm shifts. The architecture of LLMs does not seem capable of making lateral moves or sublations that are by definition not derivative or reducible to its prior circumstance, yet humans do, even though the exact way we do so is pretty mysterious and wrapped up in the difficulties in understanding consciousness.

To claim LLMs can or will equal human creativity seems to imply we can clearly define not only what creativity is, but also consciousness and also how to make a machine that can somehow do both. Humans can be creative prima facie, but to think we can also make a computer do the same thing probably means you have an inadequate definition of creativity.

I wrote a long response wrt. Kuhn under your earlier comment, but to summarize it here: I believe LLMs can make lateral moves, but they will find it hard to increment on them. That is, they can make a paradigm-shifting creative leap itself, but they can't then unlearn the old paradigm on the spot - their fixed training is an attractor that'll keep pulling them back.

As for:

> As far as I can tell, I think you are interchanging the ability to recognize creativity with the ability to be creative.

I kind of am, because I believe that the two are intertwined. I.e. "creativity" isn't merely an ability to make large conceptual leaps, or "lateral moves" - it's the ability to make a subset of those moves that will be recognized by others as creative, as opposed to recognized as wrong, or recognized as insane, or recognized as incomprehensible.

This might apply more to art than science, since the former is a moving target - art is ultimately about matching subjective perceptions of people, where science is about matching objective reality. A "too creative" leap in science can still be recognized as "creative" later if it's actually correct. With art, whether "too creative" will be eventually accepted or forever considered absurd, is unpredictable. Which is to say, maybe we should not treat these two types of "creativity" as the same thing in the first place.

> Take some pattern - of words, of data, of thought. Perturb it randomly. Done. That's creativity.

This seems a myopic view of creativity. I think leaving out the pursuit of the implications of that perturbation is leaving out the majority of creativity. A random number generator is not creative without some way to explore the impact of the random number. This is something that LLM inference models just don't do. Feeding previous output into the context of a next "reasoning" step still depends on a static model at the core.

>If I write a book that contains Einstein's theory of relativity by virtue of me copying it, did I create the theory? Did my copying of it indicate anything about my understanding of it? Would you be justified to think the next book I write would have anything of original value?

If you, after copying the book, could dynamically answer questions about the theory and its implications, and answer variations of problems or theoretical challenges in ways that reflect mainstream knowledge, I think that absolutely would indicate understanding of it. I think you are basically making Searle's Chinese room argument.

>But it is clear humans are capable of creativity and reasoning that are not reducible to mere pattern matching and this is the sense of reasoning that LLMs are not currently capable of.

Why is that clear? I think the reasoning for that would be tying it to a notion of "the human experience", which I don't think is a necessary condition for intelligence. I think nothing about finding patterns is "mere" insofar as it relates to demonstration of intelligence.

> But it is clear humans are capable of ...

It's not, though. Nobody really knows what most of the words in that sentence mean in the technical or algorithmic sense, and hence you can't really say whether LLMs do or don't possess these skills.

>nobody really knows what most of the words in that sentence mean in the technical or algorithmic sense

And nobody really knows what consciousness is, but we all experience it in a distinct, internal way that lets us navigate the world and express ourselves to others, yet apparently some comments seem to dismiss this elephant of sensation in the room by pretending it's no different than some cut and dried computational system that's programmed to answer certain things in certain ways and thus "is probably no different from a person trained to speak". We're obviously, evidentially more than that.

> by pretending it's no different than some cut and dried computational system

This is not really what is going on, what is going on is a mix-up in interpreting the meaning of words, because the meaning of words is not transitive between subject matter unless we arrive at a scientific definition which is leading, and we have not (yet).

When approaching the word consciousness from a spiritual POV, it is clear that LLMs may not possess it. When approaching consciousness from a technical point of view, it is clear that LLMs may possess it in the future. This is because the spiritual POV is anthropologically reductive (consciousness is human), and the technical POV is technically reductive (consciousness is when we can't tell it apart).

Neither statement helps us clarify opposing positions because neither definition is falsifiable, and so not scientific.

I disagree with that characterization. I don’t experience consciousness as an “internal way that lets us navigate the world and express ourselves to others”. To me it is a purely perceptional experience, as I concluded after much introspection. Sure it feeds back into one’s behavior, mostly because we prefer certain experiences over others, but I can’t identify anything in my inner experience that is qualitatively different in nature from a pure mechanism. I do agree that LLMs severely lack awareness (not just self-awareness) and thus also consciousness. But that’s not about being a “mere” computational system.
Words are not reducible to technical statements or algorithms. But, even if they were, then by your suggestion there's not much point in talking about anything at all.
They absolutely are in the context of a technical, scientific or mathematical subject.

Like in the subject of LLMs everyone knows what a "token" or "context" means, even if they might mean different things in a different subject. Yet, nobody knows what "consciousness" means in almost any context, so it is impossible to make falsifiable statements about consciousness and LLMs.

Making falsifiable statements is the only way to have an argument; otherwise it's just feelings and hunches with window dressing.

> LLMs current architecture seems to mainly work by understanding patterns in the existing body of knowledge ...

>But it is clear humans are capable of creativity and reasoning that are not reducible to mere pattern matching and this is the sense of reasoning that LLMs are not currently capable of

This is not clear at all. As it seems to me, it's impossible to imagine or think of things that are not in some way tied to something you've already come to sense or know. And if you think I am wrong, I implore you to provide a notion that doesn’t agree. I can only imagine something utterly unintelligible, and making it intelligible would require "pattern matching" (i.e. tying) it to something that is already intelligible. I mean, how else do we come to understand a newly found dead/unknown language, or teach our children? What human thought operates completely outside existing knowledge, if not given empirically?

Why can’t creativity be taking a bunch of works, finding a pattern, then randomly perturbing a data point/concept to see if there are new patterns?

Then cross referencing that new random point/idea to see if it remains internally consistent with the known true patterns in your dataset.

Isn't this often how humans create new ideas?

I see absolutely zero wrong with that statement. What he said is indeed much more reasoned and intelligent than the average foolish AI hype I've often found here, written by people who try to absurdly redefine the obvious, complex mystery that is consciousness into some reductionist notion of it being anything that presents the appearance of reasoning through technical tricks.

Chiang has it exactly right with his doubts, and the notion that pattern recognition is little different from the deeply complex navigation of reality we living things do is the badly misguided notion.

> Chiang has it exactly right with his doubts, and the notion that pattern recognition is little different from the deeply complex navigation of reality we living things do is the badly misguided notion.

How do you know this?

The same way that we know interpolation of a linear regression is not the same as the deeply complex navigation of reality we do as living things.
I notice that often in these debates someone will make the comparison between a low level mechanism driving LLMs, and a high level emergent behavior of the human mind. I don't think it's deliberate - we don't fully understand how the brain works so we only have emergent behaviors - but how can you be so certain that deeply complex navigation of reality can't emerge from interpolation of a linear regression?
That's a good question. With sufficient dimensionality, interaction terms, and enough linear regressions, I suppose it's possible. But dynamic and reactive coordination of many linear regressions wouldn't be just a linear regression. The output of a linear regression is simplistic, just like LLM token prediction is simplistic. Saying something might be a component of eventual intelligence is far from saying it is intelligence. LLMs are episodic responses to a fixed context by a fixed model that is programmed to predict tokens. Even the CoT models, while more complex, still use a static model with a recursive feed of model outputs back to the model. I think Dr. Chollet does an excellent job of identifying the fundamental difference between a potential AGI and static models in his ARC-AGI papers and presentations.
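
To make the "static model with a recursive feed of outputs" picture concrete, here is a toy sketch in Python. It is a bigram counter, not an LLM; the corpus and the function names (`train_bigram`, `generate`) are made up for illustration. It only shows the shape of the loop: the model is frozen after training, and each predicted token is appended to the context and fed back in.

```python
from collections import Counter, defaultdict

def train_bigram(tokens):
    """Count which token follows which; this table is the whole 'model'."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts

def generate(model, prompt, steps):
    """Greedy autoregressive loop: predict, append, feed back. The model never changes."""
    context = list(prompt)
    for _ in range(steps):
        options = model.get(context[-1])  # toy model only looks at the last token
        if not options:
            break  # nothing ever followed this token in the corpus
        # most frequent successor; ties broken alphabetically for determinism
        next_token = max(sorted(options), key=options.__getitem__)
        context.append(next_token)
    return context

corpus = "the cat sat on the mat and the cat ran".split()
model = train_bigram(corpus)
print(generate(model, ["on"], 2))
```

Whether stacking enough of this (with a vastly richer predictor) amounts to reasoning is exactly what the thread disagrees about; the sketch takes no side on that.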
> but how can you be so certain that deeply complex navigation of reality can't emerge from interpolation of a linear regression?

That was pretty much my question. Why are people so certain on the topic.

I wasn't trying to be flippant but challenge the excessive confidence people have on this topic.
Even most intelligent people can hallucinate, we still haven't fixed this problem. There's a lot of training material and bias which leads many to repeat those things: "LLMs are just a stochastic parrot, glorified auto complete/google search, Markov chains, just statistics", etc. The thing is, these sentences sound really good and so it's easy to repeat them when you have made up your mind. It's a shortcut.
I feel like at this point we have to separate LLMs and reasoning models too.

I can see the argument against chatGPT4 reasoning.

With the reasoning models, though, I think we get into some confusing language, but I don't know what else you would call what they do.

If you say a car is not "running" the way a human runs, you are not incorrect even though a car can "outrun" any human obviously in terms of moving speed on the ground.

But to say that since a car can't run, it can't move, is obviously completely absurd.

This was precisely what motivated Turing to come up with the test named after him, to avoid such semantic debates. Yet here we are still in the same loop.

"The terminator isn't really hunting you down, it's just imitating doing so..."

LLMs don’t go into a different mode when they are hallucinating. That’s just how they work.

Using the word “hallucinate” is extremely misleading because it’s nothing like what people do when they hallucinate (thinking there are sensory inputs when there aren’t).

It’s much closer to confabulation, which is extremely rare and is usually a result of brain damage.

This is why a big chunk of people (including myself) think the current LLMs are fundamentally flawed. Something with a massive database to statistically confabulate correct stuff 95% of the time and not have a clue when it’s completely made up is not anything like intelligence.

Compressing all of the content of the internet into an LLM is useful and impressive. But these things aren’t going to start doing any meaningful science or even engineering on their own.

Intelligent people do not "hallucinate" in the same sense that an LLM does. Counterarguments you don't like aren't "shortcuts". There are certainly obnoxious anti-LLM people, but you can't use them to dismiss everyone else.

An LLM does nothing more than predict the next token in a sequence. It is functionally auto-complete. It hallucinates because it has no concept of a fact. It has no "concept", period, it cannot reason. It is a statistical model. The "reasoning" you observe in models like o1 is a neat prompting trick that allows it to generate more context for itself.

I use LLMs on a daily basis. I use them at work and at home, and I feel that they have greatly enhanced my life. At the end of the day they are just another tool. The term "AI" is entirely marketing preying on those who can't be bothered to learn how the technology works.

They’re right until they’re wrong.

AI is (was?) a stochastic parrot. At some point AI will likely be more than that. The tipping point may not be obvious.

> Even most intelligent people can hallucinate, we still haven't fixed this problem.

No we have not, neurodiverse people like me need accommodations not fixing.

It is not hallucination. When people do what we call hallucination in ChatGPT, it is called "bullshitting", "lying" or "being incompetent".
> and wonder how an intelligent person can still think this, can be so absolute about it. What is "actual" reasoning here?

Large language models excel at processing and generating text, but they fundamentally operate on existing knowledge. Their creativity appears limited to recombining known information in novel ways, rather than generating truly original insights.

True reasoning capability would involve the ability to analyze complex situations and generate entirely new solutions, independent of existing patterns or combinations. This kind of deep reasoning ability seems to be beyond the scope of current language models, as it would require a fundamentally different approach—what we might call a reasoning model. Currently, it's unclear to me whether such models exist or if they could be effectively integrated with large language models.

> True reasoning capability would involve the ability to analyze complex situations and generate entirely new solutions, independent of existing patterns or combinations.

You mean like AlphaGo did in its 36th move?

Isn't that a non-generic 'reasoning-model' instead of something that is reminiscent of the large language model based AIs we use today?

The question is, is it possible to make reasoning models generic and can they be combined with large language models effectively.

Move 37.
"Their creativity appears limited to recombining known information"

There are some theories that this is true for humans also.

There are no human created images that weren't observed first in nature in some way.

For example, Devils/Demons/Angels were described in terms of human body parts, or 'goats' with horns. Once we got microscopes and started drawing insects then art got a lot weirder, but not before images were observed from reality. Then humans could re-combine them.

I understand your point, but it's not comparable:

Humans can suddenly "jump" cognitive levels to see higher-order patterns. Gödel seeing that mathematics could describe mathematics itself. This isn't combining existing patterns, but seeing entirely new levels of abstraction.

The human brain excels at taking complex systems and creating simpler mental models. Newton seeing planetary motion and falling apples as the same phenomenon. This compression isn't recombination - it's finding the hidden simplicity.

Recombination adds elements together. Insight often removes elements to reveal core principles. This requires understanding and reasoning.

>and wonder how an intelligent person can still think this, can be so absolute about it.

I wonder how people write things like this and don't realize they sound as sanctimonious as exactly whatever they are criticizing. Or, if I was to put it in your words: "how could someone intelligent post like this?"

You're right, it was kind of rude. Apologies. I really would rather be wrong, for a reason I gave in another comment.

The thing is, you can interact with this new kind of actor as much as you need to to judge this -- make up new problems, ask your own questions. "LLMs can't think" has needed ever-escalating standards for "real" thinking over the last few years.

Gary Marcus made a real-money bet about this.

I think a better question is "what is the value of thought?" when it came to conclusions such as yours: "I should be rude to this poster because they disagree with me"
To me this also feels like a statement that would obviously need strong justification. For if animals are capable of reasoning, probably through being trained on many examples of the laws of nature doing their thing, then why couldn't a statistical model be?
> For if animals are capable of reasoning

Are they? Which animals? Some seem smart and maybe do it. Needs strong justification.

> probably through being trained on many examples of the laws of nature doing their thing

Is that how they can reason? Why do you think so? Sounds like something that needs strong justification.

> then why couldn't a statistical model be?

Maybe because that is not how anything in the world attained the ability to reason.

A lot of animals can see. They did not have to train for this. They are born with eyes and a brain.

Humans are born with the ability to recognize patterns in what we see. We can tell objects apart without training.

> Needs strong justification.

If animals didn't show problem-solving skills, and thus reasoning, complex ones wouldn't exist anymore by now. Planning is a fundamental skill for survival in a resource-constrained environment and that's how intelligence evolved to begin with.

Assuming that intelligence and by extension reasoning are discrete steps is so backwards to me. They are quite obviously continuously connected all the way back to the first nervous systems.

Are human beings not animals? If animals can't reason, then neither can we.
I agree with Chiang. Reminds me of Searle and The Chinese Room (I agree with Searle too).

I do think that at some point everyone is just arguing semantics. Chiang is arguing that "actual reasoning" is, by definition, not something that an LLM can do. And I do think he's right. But the real story is not "LLMs can't do X special thing that only biological life can do," the real story is "X special thing that only biological life can do isn't necessary to build incredible AI that in many ways surpasses biological life".

>and wonder how an intelligent person can still think this,

Read up on the ELIZA effect

In theory you can prove a theorem just by enumerating all the possible proofs until you find the one for the theorem you want. This is extremely slow, but do you think there's any reasoning in doing this?

Of course we don't know whether an LLM is doing something like this or actually reasoning. But this is also the point, we don't know.

If you ask a person a question you can be confident to some degree that they didn't memorize the answer beforehand, so you can evaluate their ability to "reason" and come up with an answer for it. With an LLM, however, this is incredibly hard to do, because it could have memorized the answer.

> In theory you can prove a theorem just by enumerating all the possible proofs until ...

An interesting hypothesis! I'm neither a mathematical logician, nor decently up to date in that field - is the possibility of this, at least in the abstract, currently accepted as fact?

(Yes, there's the perhaps-separate issue of only enumerating correct proofs.)

It depends on what theory you're working in (at which point deciding whether to use one theory or another becomes more like a philosophical question).

I'm mostly familiar with type theory, of which there are many variants, but the most common ones all share the most important characteristics. In particular they identify theorems with types, and proofs with terms, where correct proofs are well-typed terms. The nice thing is that terms are recursively enumerable, so you can list all proofs. Moreover, most type theories have decidable type checking, so you can automatically check whether a term is well-typed (and hence whether the corresponding proof is correct).
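
For anyone who hasn't seen "theorems are types, proofs are terms" written down, a minimal illustration in Lean 4 syntax (the theorem name `const_proof` is made up for this example):

```lean
-- The statement `A → B → A` is a type.
-- The function `fun a _ => a` is a term of that type, i.e. a proof of the statement.
theorem const_proof (A B : Prop) : A → B → A :=
  fun a _ => a

-- An ill-typed "proof" is rejected by the checker; this would not compile:
-- theorem bogus (A B : Prop) : A → B := fun a => a
```

Checking the proof is just type checking the term, which is the decidable part.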

This is not just theory, there exist already a bunch of tools that are being used in practice for mechanically checking mathematical proofs, like Coq, Lean, Agda and more.

When I said "in theory" however it's because in practice enumerating all proof terms will be very very slow and will take forever to reach proofs for theorems that we might find interesting.
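
As a rough sketch of what "enumerate all proof terms and type-check each one" looks like, here is a toy in Python for implication-only propositional logic (simply typed lambda terms with de Bruijn indices). Everything here is an assumption made for illustration: the term encoding, the function names, and the shortcut of only trying subformulas of the goal as lambda annotations, which keeps the enumeration finite per size but is not how Coq, Lean or Agda work.

```python
def imp(a, b):
    """The type/proposition a -> b. Atoms are plain strings."""
    return ("->", a, b)

def type_of(term, ctx=()):
    """Type-check a term; ctx lists binder types, innermost first. None means ill-typed."""
    tag = term[0]
    if tag == "var":
        return ctx[term[1]] if term[1] < len(ctx) else None
    if tag == "lam":
        _, ty, body = term
        body_ty = type_of(body, (ty,) + ctx)
        return None if body_ty is None else imp(ty, body_ty)
    # application: the function's argument type must equal the argument's type
    f_ty, x_ty = type_of(term[1], ctx), type_of(term[2], ctx)
    if isinstance(f_ty, tuple) and x_ty is not None and f_ty[1] == x_ty:
        return f_ty[2]
    return None

def subformulas(ty):
    yield ty
    if isinstance(ty, tuple):
        yield from subformulas(ty[1])
        yield from subformulas(ty[2])

def terms(size, depth, annotations):
    """Blindly enumerate every term with exactly `size` nodes under `depth` binders."""
    if size == 1:
        for i in range(depth):
            yield ("var", i)
        return
    for ty in annotations:
        for body in terms(size - 1, depth + 1, annotations):
            yield ("lam", ty, body)
    for left in range(1, size - 1):
        for f in terms(left, depth, annotations):
            for x in terms(size - 1 - left, depth, annotations):
                yield ("app", f, x)

def find_proof(goal, max_size=8):
    """Return the first enumerated closed term whose type is `goal`, or None."""
    annotations = sorted(set(subformulas(goal)), key=repr)
    for size in range(1, max_size + 1):
        for term in terms(size, 0, annotations):
            if type_of(term) == goal:
                return term
    return None

print(find_proof(imp("A", imp("B", "A"))))  # some term proving A -> B -> A
print(find_proof(imp("A", "B"), max_size=5))  # not provable, so the search gives up
```

Even in this tiny logic the number of candidate terms grows quickly with `max_size`, which is the "very very slow" part. Note that there is no insight anywhere in the loop, only generate and check.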

Since we're on the LLM topic, there are efforts to use LLMs to speed up this search, though this is more like using them as search heuristics. It does help that you can get automatic feedback thanks to the aforementioned proof checking tools, meaning you don't need costly human supervision to train them. The hope would be getting something like what Stockfish/AlphaZero is for chess.

Has an AI “proven” a theorem?
Leading mathematician Terence Tao says yes, with a lot of guidance.

https://mathstodon.xyz/@tao/113132503432772494

I have a similar opinion of Claude Sonnet. Superhuman knowledge; ability to apply it to solve new math/coding problems at roughly the level of a motivated high-schooler (but not corresponding exactly in profile to anything human); less ability to stay on track the longer the effort takes.

But ChatGPT a couple years ago was at more like grade-school level at problem-solving. What should I call this thing that the best LLMs can do better than the older ones, if it's not actual reasoning? Sparkling syllogistics?

Sorry, that's sarcastic, but... it's from a real exasperation at what seems like a rearguard fight against an inconvenient conclusion. I don't like it either! I think the rate of progress at building machines we don't understand is dangerous. (Understanding the training is not understanding the machinery that comes out.)

Compare the first previews of Copilot with current frontier "reasoning" models, and ask how this will develop in the next five years. Maybe it'll fizzle. If you're very confident it will: I'd like to be convinced too.

You said it was sarcastic, but I like "syllogistic" a lot. We need more vocabulary to describe what LLMs do, and if I tell ChatGPT A implies B implies C, and I tell it A is true, and I can describe that as the LLM syllogisting and not use the words "reasoning" or "thinking", that works for me.

As for whether it will fizzle: even if it does, what we have currently is already useful. Society will take time to adjust to ChatGPT-4's level of capabilities, never mind whatever OpenAI et al. release next. It can't yet replace a software engineer, but it makes projects possible that previously weren't attempted because they required too much investment. So unless you're financially exposed to AI directly (which you might be, many people are!), the question of whether it's going to fizzle is more academic than something that demands a rigorous answer. Proofs of a negative are really hard. Reusable rockets were "proven" to be impossible right up until they were empirically proven possible.

I don't see Tao claiming ChatGPT proved a theorem. Moreover, most questions seemed to be about something already talked about online, so it seems plausible that it was included in the training data. This is IMO a big issue with evaluating LLMs: you can't keep asking the same questions, because you can't be sure whether they will eventually answer by memory or actually reason.
"or actually reason." How can you be sure it actually does "reasoning"? All I can see is that it just made up some nonsense words.
That's the point of my comment though...
I think the "LLM is intelligence" crowd has a very simplistic view of people. If you feel that natural language and the systems responsible for it are pretty much the only things that human intelligence produces, then I can see the argument.

But I don't believe that. That a machine can produce convincing human-language chains of thought says nothing about its "intelligence". Back when basic RNNs/LSTMs were at the forefront of ML research, no one had any delusions about this fact. And just because you can train a token prediction model on all of human knowledge (which the internet is not) doesn't mean the model understands anything.

It's surprising to me that the people most knowledgeable about the models often appear to be the biggest believers - perhaps they're self-interestedly pumping a valuation or are simply obsessed with the idea of building something straight from the science fiction stories they grew up with.

In the end though, the burden of proof is on the believers, not the deniers.

> It's surprising to me that the people most knowledgeable about the models often appear to be the biggest believers - perhaps they're self-interestedly pumping a valuation or are simply obsessed with the idea of building something straight from the science fiction stories they grew up with.

"Believer" really is the most appropriate label here. Altman or Musk lying and pretending that "AGI" is right around the corner to pump their stocks is to be expected. Actually knowledgeable people making completely irrational claims is simply incomprehensible beyond narcissism and obscurantism.

Interestingly, those who argue against the fiction that current models are reasoning are using reason to make their points. A non-reasoning system generating plausible text is not at all a mystery and can be explained; therefore, generating plausible text is not sufficient for a system to qualify as reasoning.

Those who are hyping the emergence of intelligence out of statistical models of written language on the other hand rely strictly on the basest empiricism, e.g. "I have an interaction with ChatGPT that proves it's intelligent" or "I put your argument into ChatGPT and here's what it said, isn't that interestingly insightful". But I don't see anyone coming out with any reasoning on how ability to reason could emerge out of a system predicting text.

There's also a tacit connection made between those language models being large and complex and their supposed intelligence. The human brain is large and complex, and it's the material basis of human intelligence, "therefore expensive large language models with internal behavior completely unexplainable to us, must be intelligent".

I don't think it will, but if the release of the DeepSeek models effectively shifts the main focus towards efficiency as opposed to "throwing more GPUs at it", that will also force the field to produce models with the current behavior using only the bare minimum, both in terms of architecture and resources. That would help against some aspects of the mysticism.

The biggest believers are not the best placed to drive the research forward. They are not looking at it critically and trying to understand it. They are using every generated sentence as a confirmation of their preconceptions. If the most knowledgeable are indeed the biggest believers, we are in for a long dark (mystic) AI winter.

Can an LLM discover a new theory of natural law, such as prove or disprove string theory? This is what I ponder. The creation or discovery of something new that it can’t just copy, that would have to be discovered via human thought otherwise. Something provably true. The equivalent of discovering general relativity before humans had.
I feel like there are likely far more useful answers in overlooked science than there are in "new" science.
I mean the vast vast majority of people cannot prove/disprove theorems either, but we still consider ourselves as intelligent.
I’m not demanding it discover new science before it can be called intelligent. It’s a thought experiment.
Sounds to me like a sci-fi author exploring his thoughts. Perhaps a full treatment of the subject wasn’t on his mind that day. I also don’t expect the article’s author to include anything they feel is mundane to themselves. Or to only include what they personally found interesting.
> how an intelligent person can still think this

Cognitive neuroscience

“qualia”

Ray Kurzweil

I’ll take “things OP doesn’t know about that an intelligent person does” for 800 Alex.

If you’re enamored with LLMs and can’t see the inherent problems, you don’t actually know about AI and machine learning.

Chiang is a silly blob of meat, of course he's not capable of actual reasoning, much less "intelligence".

We grant him personhood, but personhood, like the LA Review of Books, is just a social construct.

This is a good point.

If you prick an LLM does it not bleed? If you tickle it does it not laugh? If you poison one does it not die? If you wrong an LLM shall it not revenge?

What is an intelligent person? You seem to have approached the article with an existing reverence.

Rather 1984 to look at the contribution of an academic and an iron welder and see authority in someone who memorized the book, but not how to keep themselves alive. Chiang and the like are nihilists, indifferent if they die cause it all just goes dark to them. Indifferent to the toll they extract from labor to fly their ass around speaking about glyphs in a textbook. Academics detached from the real work people need are just as draining on society and infuriating as a billionaire CEO and tribal shaman. Especially these days when they derive some small normalization from 100s of years of cataloged work and proclaim their bit of syntactic art is all they should need to spend the rest of their life being celebrated like they’re turning 8 all over again.

Grigori Perelman is the only intelligent person out there I respect. Copy-paste college grads all over the US recite the textbook and act like it’s a magical incantation that bends the will of others. Cult of social incompetence in the US.

Please, wake me up when artificial so-called intelligence has proved a new theorem.
Ted Chiang revealed himself at his NeurIPS "Pluralism and Creativity" workshop to be... a great book author and not much else. His statements during his panels with the other AI researchers proved that he was not up to date on modern AI research.

He's overly sentimental, and so are his books. I wish there were other sci-fi authors that the AI community wanted to contact but after "Arrival" I get it since "Arrival" is the literal wet-dream of many NLP/AI researchers.