Hacker News
by xcv123 784 days ago
LLMs have already demonstrated their ability to generally apply abstract rules of logical reasoning to problems outside of their training set. Those are instances of reasoning.

The models are learned from data, and they infer the logical reasoning methods implied in the data, but they are not explicitly rule-driven, so they can produce output that is logically inconsistent.

Solving natural language processing tasks requires an understanding of the context of words, and this contextual understanding is constructed through a form of reasoning. You can choose to ignore all of the reasoning that they are demonstrably capable of, and narrowly focus on their imperfections, then declare they are capable of "0 reasoning", but that conclusion is an oversimplification. With actual "0 reasoning" ability they wouldn't be able to perform standard NLP tasks such as translation or summarization anywhere close to their current level of accuracy.

The Wikipedia definition refers to consciousness, so that excludes machines. That informal definition is not useful in the context of AI. We need to use technical definitions. Otherwise it is subjective and we end up wasting time squabbling about semantics.

1 comments

> We need to use technical definitions.

I agree. I more or less disagree with all your other claims. A linear model is not reasoning when it extrapolates. Neural networks do not "apply abstract rules of logical reasoning to problems outside of their training set".

> Solving natural language processing tasks requires an understanding of the context of words

Agree.

> and this contextual understanding is constructed through a form of reasoning

Disagree. These models are predictive models. Here is a simple estimator for your height: ((mom's height)+(dad's height))/2. This estimator is very good and modern genomic techniques struggle to beat it. Does this model reason? Obviously not. Yet, the 1B parameter genomics models do reason about the genes? That makes no sense. They are both models. Incapable of reason.
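For concreteness, the mid-parent estimator above written out as a sketch. The function name and the sample heights are made up for illustration and are not from any dataset.

```python
def midparent_height(mom_cm: float, dad_cm: float) -> float:
    """Predict a child's height as the plain average of the parents' heights."""
    return (mom_cm + dad_cm) / 2


# Hypothetical inputs, in centimetres.
print(midparent_height(165.0, 181.0))
```

The whole "model" is one addition and one division, which is the point of the comparison.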

> You can choose to ignore all of the reasoning that they are demonstrably capable of

They have not demonstrated any reasoning. Just because they can solve olympiad level math questions, it does not mean they can reason. This does not take away from the power of these models. These models will have full conversations about reasoning, explain syllogistic logic, and apply different deductions which seemingly show they are reasoning. It's all impressive. Then you are harshly reminded by the following interaction that it is just an amazing ability to memorize:

Again with Claude. Prompt:

A farmer and a goat need to cross a river. There is a boat that can fit exactly one human and one animal. What is the least number of crossings needed to get everything to the other side of the river?

===

Okay, let's approach this classic logic puzzle systematically:

Given:

- There is a farmer, a goat, and a boat that can carry one human and one animal at a time.

- The farmer cannot leave the goat alone with the crop, as the goat will eat the crop.

- The objective is to get the farmer, goat, and the boat across the river using the minimum number of crossings.

To solve this, we need to find a sequence of crossings that adheres to the constraint and achieves the objective in the fewest steps.

Here's one valid solution that requires 7 crossings:

1) Farmer rows the goat across the river.

2) Farmer rows back alone.

3) Farmer rows the crop across the river.

4) Farmer rows the goat back.

5) Farmer rows across again.

6) Goat rows across.

7) Farmer rows the goat across.

Explanation:

- Crossing 1 gets the goat across initially.

- Crossing 2 brings the farmer back to pick up the crop.

- Crossing 3 takes the crop across.

- Crossing 4 brings the goat back so it's not alone with the crop.

- Crossing 5 takes the farmer across.

- Crossing 6 takes the goat across since it's now safe.

- Crossing 7 takes the farmer and goat together to the desired side.

This sequence ensures the goat is never left unattended with the crop, adhering to the constraint.

It can be proven that 7 crossings is the minimum required, as:

- The goat must cross twice (there and back).

- The crop must cross once.

- The farmer must cross at least twice to accompany the goat and crop.

- Additional crossings are needed to reposition the boat.

So the optimal solution involves 2 crossings for the goat, 1 for the crop, and a minimum of 4 additional crossings for the farmer and boat repositioning, totaling 7 crossings.

Therefore, the least number of crossings needed to get the farmer, goat, and boat across the river is 7.

===

All appearances of reasoning are nullified by a response that pulls stuff out of thin air and states it confidently.
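For reference, the puzzle exactly as stated in the prompt (a farmer, a goat, no crop) can be checked with a small breadth-first search. This is a sketch under two assumptions that the prompt does not spell out: only the farmer can row, so the boat is always on the farmer's bank, and nothing restricts leaving the goat alone. The function name is hypothetical.

```python
from collections import deque


def min_crossings() -> int:
    """Fewest crossings to move farmer and goat from bank 0 to bank 1."""
    # State: (farmer_bank, goat_bank); 0 = starting bank, 1 = far bank.
    start, goal = (0, 0), (1, 1)
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        (farmer, goat), steps = queue.popleft()
        if (farmer, goat) == goal:
            return steps
        other = 1 - farmer
        # The farmer either crosses alone or takes the goat if it is on his bank.
        moves = [(other, goat)]
        if goat == farmer:
            moves.append((other, other))
        for state in moves:
            if state not in seen:
                seen.add(state)
                queue.append((state, steps + 1))
    raise ValueError("goal unreachable")


print(min_crossings())  # prints 1
```

Under those assumptions the search confirms the obvious answer: the farmer takes the goat across in a single crossing.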

I'll refer you back to this 1990 paper by Geoffrey Hinton. Up to you if you want to investigate this further.

If you can prove this is wrong then you should publish your result.

https://www.cs.toronto.edu/~hinton/absps/AIJmapping.pdf

"This 1990 paper demonstrated how neural networks could learn to represent and reason about part-whole hierarchical relationships, using family trees as the example domain.

By training on examples of family relations like parent-child and grandparent-grandchild, the neural network was able to capture the underlying logical patterns and reason about new family tree instances not seen during training.

This seminal work highlighted that neural networks can go beyond just memorizing training examples, and instead learn abstract representations that enable reasoning and generalization"

You've cited this paper multiple times in this thread. I'll go through the exercise of telling you why I do not think this paper shows anything. Hopefully, you will then address one of the many points I've given as counterpoints.

First, your summary of the paper is nowhere to be found in the paper so I assume this is your summary. You say "the neural network was able to capture the underlying logical patterns and reason about new family tree instances not seen during training." This paper does not include training details. It delegates them to another paper, [10]. From the details in this paper, it is trained on 100 out of the 104 total relations. However, there are only 12 distinct relations: father, mother, husband, wife, son, daughter, uncle, aunt, brother, sister, nephew, niece. That means each relation is seen ~8 times. Now, your claim is "underlying logical patterns and reason about new family tree instances not seen during training" but that's a gross misrepresentation of what is happening here. First, it's given multiple instances of the same tree with different labels. Second, the inputs appear to be the 24 people involved and so you cannot possibly extend this to new tree topologies. Finally, this to me is the money quote of the paper:

> Does it make use of the isomorphism between the two family trees to allow it to encode them more efficiently and to generalize relationships in one family tree by analogy to relationships in the other? If it does all these things, it seems reasonable to say that it is doing inference rather than mere association.

Now, we have to be careful here because inference might be construed as reasoning. Obviously, the model is performing some type of statistical inference: a model has been posited (3 layer neural network) and the output is being trained (presumably - no training details) to minimize classification error through something like KL divergence, which is equivalent to MLE, so it is indeed statistical inference. This model is so simple, you could manually work out the inference by doing a page full of multiplications. I brought this up before, so I'll ask you to specifically address this point. No one claims linear models perform reasoning. Why are you proposing that this 3 layer (read, 3 matrix multiplies) is doing reasoning?
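The KL/MLE equivalence mentioned above can be spelled out in one line. The notation here is an assumption, not the paper's: $\hat{p}$ is the empirical distribution over the $N$ training examples $x_i$, $q_\theta$ is the model's predicted distribution, and $H$ is entropy.

```latex
\mathrm{KL}\left(\hat{p} \,\|\, q_\theta\right)
  = \sum_{x} \hat{p}(x) \log \frac{\hat{p}(x)}{q_\theta(x)}
  = -H(\hat{p}) - \frac{1}{N} \sum_{i=1}^{N} \log q_\theta(x_i)
```

The entropy term does not depend on $\theta$, so minimizing the KL divergence over $\theta$ picks the same parameters as maximizing the average log-likelihood.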

Yes, it's not my summary. I originally learned about the family tree example from a lecture by Geoffrey Hinton. I found some lecture slides here which reference the example, but I can't find the original lecture right now.

https://www.cs.toronto.edu/~hinton/coursera/lecture4/lec4.pd...

> No one claims linear models perform reasoning. Why are you proposing that this 3 layer (read, 3 matrix multiplies) is doing reasoning?

A 3 layer neural network is a non-linear function. It is not a linear model. There are activation functions between the layers which make it non-linear.
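A small sketch of that distinction. The weights are arbitrary and ReLU is an assumed stand-in for whatever activation the paper's network used. Three stacked matrix multiplies with no activation collapse into one matrix, while the same stack with an activation between layers breaks additivity, which any linear map must satisfy.

```python
def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def matmul(A, B):
    """Multiply matrix A by matrix B (both lists of rows)."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def relu(v):
    """Element-wise ReLU, an assumed activation for illustration."""
    return [max(0.0, vi) for vi in v]


# Arbitrary illustrative weights for three layers.
W1 = [[1.0, -1.0], [0.5, 2.0]]
W2 = [[2.0, 0.0], [-1.0, 1.0]]
W3 = [[1.0, 1.0]]


def linear_stack(x):
    """Three matrix multiplies with nothing in between."""
    return matvec(W3, matvec(W2, matvec(W1, x)))


def relu_stack(x):
    """The same three layers with ReLU between them."""
    return matvec(W3, relu(matvec(W2, relu(matvec(W1, x)))))


x, y = [1.0, 0.0], [-1.0, 0.0]
x_plus_y = [a + b for a, b in zip(x, y)]

# Without activations the stack equals a single matrix W3 @ W2 @ W1.
collapsed = matmul(W3, matmul(W2, W1))
print(linear_stack(x) == matvec(collapsed, x))

# With activations, f(x + y) differs from f(x) + f(y), so f is not linear.
print(relu_stack(x_plus_y)[0], relu_stack(x)[0] + relu_stack(y)[0])
```

Whether a non-linear function amounts to reasoning is a separate question; the sketch only shows that "3 matrix multiplies" and "3 layers with activations" are not the same class of function.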