A typical mid-wit response is to say it's just a Markov chain doing naive next-token prediction without any semantic model. That's not how deep learning works.
The LLM neural network contains a semantic model and it performs some type of reasoning over that model. The idiot and the genius both can see that ChatGPT has some reasoning capability.
"This 1990 paper demonstrated how neural networks could learn to represent and reason about part-whole hierarchical relationships, using family trees as the example domain.
By training on examples of family relations like parent-child and grandparent-grandchild, the neural network was able to capture the underlying logical patterns and reason about new family tree instances not seen during training.
This seminal work highlighted that neural networks can go beyond just memorizing training examples, and instead learn abstract representations that enable reasoning and generalization."
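For reference, the "Markov chain doing naive next token prediction" characterization mentioned above would look roughly like the sketch below. This is only an illustration, not anything from the paper or from an actual LLM: the function names `train_bigram` and `predict_next` and the toy corpus are made up, and it assumes a first-order (bigram) chain that looks at nothing but the previous token.

```python
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count how often each token follows each other token."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def predict_next(counts, prev):
    """Return the most frequent follower of `prev`, or None if `prev` was never seen."""
    if prev not in counts:
        return None
    return counts[prev].most_common(1)[0][0]


# Toy corpus (hypothetical): "the" is followed by "cat" twice and "mat" once.
corpus = "the cat sat on the mat the cat ran".split()
counts = train_bigram(corpus)

print(predict_next(counts, "the"))  # most frequent follower of "the"
print(predict_next(counts, "dog"))  # token never seen in training
```

A table like this has no internal representation at all: it cannot say anything about a token it has not seen, which is the property the comment above is contrasting with learned representations.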
I would tend to agree with seizethecheese. I don't think anyone says "Markov chain doing naive next token prediction without any semantic model", with the emphasis on the last part of that sentence. The idiot only says the first part and disregards the last part. The genius says the first part with an understanding that, no matter how complex a semantic model is, the "reasoning" that occurs is completely a function of the training data. Change the data, change the weights. At no point in the model does anything guarantee consistency of logic or anything like that. Certainly, LLMs are not interested in any notion of truth in the philosophical sense.
Here is a succinct example using Claude:
Prompt:
Using the decimal system, redefine all symbols by shifting them one to right such that 0 becomes 1, 1 becomes 2, 2 becomes 3, etc. Now, let's do math. What is 2x3?
===
Okay, with the symbols shifted one to the right in the decimal system:
0 becomes 1
1 becomes 2
2 becomes 3
3 becomes 4
4 becomes 5
5 becomes 6
6 becomes 7
7 becomes 8
8 becomes 9
9 becomes 0 (looping back around)
So if 2 becomes 3, and 3 becomes 4, then:
2 x 3 = 3 x 4 = 12
Shifting those digits one to the right based on the new mapping: 12 becomes 23
Therefore, under this new symbol mapping, 2 x 3 = 23
===
Regardless, if you are interpreting 2x3 to be in the original space, then 2x3 = 6, which would be written as 7 in the new space. If you are interpreting 2x3 to be in the new space (which is how I would interpret it), then 2x3 is 1x2 = 2, which is written as 3. Neither reading gives 23.
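The two readings above, and the procedure the transcript actually followed, can be checked mechanically. This is a minimal sketch that assumes the relabeling is applied digit by digit with 9 wrapping to 0, as in the mapping the model printed; the helper names are made up for illustration.

```python
# Relabeling from the prompt: every digit symbol becomes the next one, 9 wraps to 0.
SHIFT = {str(d): str((d + 1) % 10) for d in range(10)}
UNSHIFT = {new: old for old, new in SHIFT.items()}


def encode(n: int) -> str:
    """Write an ordinary integer using the shifted symbols."""
    return "".join(SHIFT[ch] for ch in str(n))


def decode(symbols: str) -> int:
    """Read a string of shifted symbols back as an ordinary integer."""
    return int("".join(UNSHIFT[ch] for ch in symbols))


def product_original_space(a: str, b: str) -> str:
    # Read the operands as ordinary numerals, multiply, then relabel the result.
    return encode(int(a) * int(b))


def product_new_space(a: str, b: str) -> str:
    # Read the operands as shifted symbols: decode, multiply, re-encode.
    return encode(decode(a) * decode(b))


def product_as_in_transcript(a: str, b: str) -> str:
    # What the response did: relabel the operands, multiply those, relabel again.
    return encode(int(encode(int(a))) * int(encode(int(b))))


print(product_original_space("2", "3"))    # "7"
print(product_new_space("2", "3"))         # "3"
print(product_as_in_transcript("2", "3"))  # "23"
```

The third function applies the shift in the wrong direction to the operands and then shifts the product a second time, which is how the response ended up at 23.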
I think it's clear from this example that the LLM has 0 ability to reason.
> I think it's clear from this example that the LLM has 0 ability to reason.
It's not a 0 or 1. You are oversimplifying it. Obviously neural networks can learn to generalize patterns of reasoning inferred from their training data. We know that they are not using explicitly defined formal systems of reasoning, and they have some limitations compared to those systems. Anyone who has seriously studied neural networks or machine learning understands this.
By the same logic, practically every human on Earth has "0 ability to reason" as their biological neural network will get confused and make mistakes.
Anyone who has studied neural networks also knows there's no comparison between computer neural networks and human biological neural networks. The name was picked because of a passing familiarity with the biological by someone who didn't have any experience in biological neural networks. It's been sufficiently proven they have no similarity by countless academics.
That is a blatant oversimplification and not true. There are both similarities and differences. New ANN training methods are inspired by studies of biological neural networks (dropout regularization is one example).
You can't implement backpropagation biologically. The fact that you didn't even mention spiking neural networks speaks volumes. Those are heavily biologically inspired and yet they have fallen behind ANNs precisely because backpropagation doesn't work on them.
It actually is 0 or 1 in this case. You either have the power of deduction or you do not. You have either proved a theorem or you have not. If you got to a correct conclusion through incorrect means, you have incorrectly reasoned. There is no spectrum in reasoning. Perhaps a spectrum in abilities across humans but not in the logic itself.
> By the same logic, practically every human on Earth has "0 ability to reason" as their biological neural network will get confused and make mistakes.
And therein lies the problem with this whole debate. I think a huge part of the debate rests on a conflation: because most humans do not reason well (I wouldn't say they cannot reason) and make mistakes, people conclude that reasoning is something fuzzy and make statements like "LLMs reason about as well as humans". Very few humans outside of mathematicians practice logic on a daily basis. Most humans get by with muscle memory and pattern recognition of previous tasks. Just because LLMs are roughly as good as humans at this behavior does not make them able to reason. I would be totally fine if people just replaced "can reason" with "are useful" within their statements so they would look more like "LLMs are as useful as humans in answering MCAT tests." To imply there is a rational actor deriving responses from first-order logic is disingenuous in my opinion.
Only if you define reasoning ability as exactly equivalent in capability to a formal theorem prover. But that is a difference in tribe or philosophy. Your Symbolic/Classical rule-based AI tribe versus the Connectionist AI tribe. No point discussing further as it's like arguing Democrat vs Republican. Both approaches have their strengths and weaknesses.
I am defining reason exactly as Wikipedia puts it: "Reason is the capacity of applying logic consciously by drawing conclusions from new or existing information, with the aim of seeking the truth."
There are no tribes here. Republican vs. Democrat, I do not care. If your logic is unsound, I'm going to call you out even if I agree with the conclusion. State your definitions so we can have a formal logic-based debate. For the record, I use neural networks every day and believe that they are incredibly useful and can be purpose-built to beat humans on a large set of tasks. Can they reason? No. Can formal theorem provers reason? No they cannot. They can only verify.
https://www.cs.toronto.edu/~hinton/absps/AIJmapping.pdf