Hacker News new | ask | show | jobs
by coderaptor 946 days ago
I haven’t heard a definition of “reasoning” or “thinking” that proves humans aren’t doing exactly that same probabilistic regurgitation.

I don’t think it’s possible to prove; feels like a philosophical question.

5 comments

I won't define reasoning, just call out one aspect.

We have the ability to follow a chain of reasoning, say "that didn't work out", backtrack, and consider another. ChatGPT seems to get tangled up when its first (very good) attempt goes south.

This is definitely a barrier that can be crossed by computers. AlphaZero is better than we are at it. But it is a thing we do which we clearly don't simply do with the probabilistic regurgitation method that ChatGPT uses.

That said, the human brain combines a bunch of different areas that seem to work in different ways. Our ability to engage in this kind of reason, for example, is known to mostly happen in the left frontal cortex. So it seems likely that AGI will also need to combine different modules that work in different ways.

On that note, when you add tools to ChatGPT, it suddenly can do a lot more than it did before. If those tools include the right feedback loops, the ability to store/restore context, and so on, what could it then do? This isn't just a question of putting the right capabilities in a box. They have to work together for a goal. But I'm sure that we haven't achieved the limit of what can be achieved.

these are things we can teach children to do when they don't do it at first. I don't see why we can't teach this behavior to AI. Maybe we should teach LLM's to play games or something. or do those proof thingys that they teach in US high school geometry or something like that. To learn some formal structure within which they can think about the world
Instead of going bank you can construct a tree of different reasonings with an LLM then take a vote or synthesise see Tee of thought prompting
It feels like humans do do a similar regurgitation as part of a reasoning process, but if you play around with LLMs and ask them mathematical questions beyond the absolute basics it doesn’t take long before they trip up and reveal a total lack of ‘understanding’ as we would usually understand it. I think we’re easily fooled by the fact that these models have mastered the art of talking like an expert. Within any domain you choose, they’ve mastered the form. But it only takes a small amount of real expertise (or even basic knowledge) to immediately spot that it’s all gobbledygook and I strongly suspect that when it isn’t it’s just down to luck (and the fact that almost any question you can ask has been asked before and is in the training data). Given the amount of data being swallowed, it’s hard to believe that the probabilistic regurgitation you describe is ever going to lead to anything like ‘reasoning’ purely through scaling. You’re right that asking what reasoning is may be a philosophical question, but you don’t need to go very far to empirically verify that these models absolutely do not have it.
On the other hand, it seems rather intuitive we have a logic based component? Its the underpinning of science. We have to be taught when we've stumbled upon something that needs tested. But we can be taught that. And then once we learn to recognize it, we intuitively do so in action. ChatGPT can do this in a rudimentary way as well. It says a program should work a certain way. Then it writes it. Then it runs it. Then when the answer doesn't come out as expected (at this point, probably just error cases), it goes back and changes it.

It seems similar to what we do, if on a more basic level. At any rate, it seems like a fairly straight forward 1-2 punch that, even if not truly intelligent, would let it break through its current barriers.

LLMs can be trained on all the math books in the world, starting from the easiest to the most advanced, they can regurgitate them almost perfectly, yet they won't apply the concepts in those books to their actions. I'd count the ability to learn new concepts and methods, then being able to use them as "reasoning".
Aren't there quite a few examples of LLMs giving out-of-distribution answers to stated problems? I think there are two issues with LLMs and reasoning:

1. They are single-pass and static - you "fake" short-term memory by re-feeding the questions with it answer 2. They have no real goal to achieve - one that it would split into sub-goals, plan to achieve them, estimate the returns of each, etc.

As for 2. I think this is the main point of e.g. LeCun in that LLMs in themselvs are simply single-modality world models and they lack other components to make them true agents capable of reasoning.

its those kinds of examples that make it hard to cleave a measurement of success.

Based on those kinds of results an LLM should, in theory, be able to plan, analyze and suggest improvements, without the need for human intervention.

You will see rudimentary success for this as well - however, when you push the tool further, it will stop being... "logical".

I'd refine the point to saying that you will get some low hanging fruit in terms of syntactic prediction and semantic analysis.

But when you lean ON semantic ability, the model is no longer leaning on its syntactic data set, and it fails to generalize.

It’s possible to prove.

Use an LLM to do a real world task that you should be able to achieve by reasoning.

> Use an LLM to do a real world task that you should be able to achieve by reasoning.

Such as explaining the logical fallacies in this argument and the one above?

Take anything, see how far you get before you have to really grapple with hallucination.

Once that happens, your mitigation strategy will end up being the proof.

I mean I know you're joking but yes, it would be able to do that.