by sshine 757 days ago
> Problem is the current systems can’t reason about things

Sounds like the AGI argument trap: They're not able to reason, but we can't succinctly define what it is.

I don't come with a reasoning chip. Whatever I call reasoning happens as a byproduct of my neural process.

I do think that the combination of a transformer network and calls to customized reasoning chips (systems that search and deduce answers, like Wolfram Alpha or logic/proof systems) may be a short stop on the way to something that can reason and execute actions better than humans, but is not AGI.


> They're not able to reason, but we can't succinctly define what it is.

For transformer-based LLMs, and most LLMs, there's an obvious class of problems that they cannot solve. LLMs generally perform bounded computation per token, so they cannot reason about computational problems that are more than linearly complex, for a sufficiently large input instance. If you have a back-and-forth (many-shot), your LLM can possibly use the context as state to solve harder problems, up to the context window, of course.
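To make the budget argument concrete, here is a toy sketch in Python. It takes the simplifying assumption above at face value: every token gets at most a fixed number of steps. The constant `steps_per_token`, the ratio `output_tokens_per_input`, and the n² cost are all hypothetical numbers chosen for illustration, not measurements of any real model.

```python
def total_budget(n_tokens: int, steps_per_token: int) -> int:
    """Upper bound on total work when each token gets a fixed step budget."""
    return n_tokens * steps_per_token


def quadratic_cost(n: int) -> int:
    """Steps needed by a hypothetical problem that costs n**2 on input size n."""
    return n * n


def first_unsolvable_size(steps_per_token: int, output_tokens_per_input: int = 1) -> int:
    """Smallest n where the n**2 problem exceeds the budget of the emitted tokens."""
    n = 1
    while quadratic_cost(n) <= total_budget(n * output_tokens_per_input, steps_per_token):
        n += 1
    return n


print(first_unsolvable_size(1000))    # one output token per input item
print(first_unsolvable_size(10, 5))   # five output tokens per input item
```

Under these assumptions, letting the model emit more tokens only pushes the crossover point further out. A linear budget still loses to a superlinear cost eventually.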

Humans can realise they don’t understand something and seek more knowledge to learn to understand it. But also humans can build complex structures out of simple fundamentals: The same logic of counting up beans on a table can be extrapolated to multiplying that table of beans. And then counting horses the same way you count beans but give them a value of multiple beans. And then simplify that by trading in promises of beans in trade of horses.

The fact that so many people can’t see the fundamental differences between an LLM and human intelligence reminds me of back when the very early computer scientists thought they could model the entirety of nature by reducing every “component” to a numeric value and compute it as “transfer of energy”.

Quite literally they did the same thing: They had a new toy (very advanced computation machines) and forced all of nature to “fit” within it. It also ended in failure, obviously. Not because nature or ecosystems (as it was coined) are “magic” but because grossly oversimplifying reality to fit desired models is a fool’s errand.

We’ll have to wait and see how far multimodal training takes us. Text-only models are extremely limited by the kind of information we can encode as text and by the loss of detail, e.g. the word “cat” vs an image of a cat vs a video of a cat vs direct physical interaction with a cat vs being a mammal that shares a great deal of biology with a cat. You need a table and beans before you can invent a method for counting them.

> LLMs generally perform bounded computation per token, so they cannot reason about computational problems that are more than linearly complex, for a sufficiently large input instance.

I can’t judge if this is true, because I don’t know transformers well, but if it is, it unravels an intuitive thought I’ve never been able to articulate about not only LLMs, but possibly all pattern matching and the human analog of System 1 thinking.

Another fuzzy way of saying this is there’s something irreducible about complexity that can’t be pattern matched by any bounded heuristic – that it’s wishful thinking to assume historical data contains hidden higher-level patterns that unlock magical shortcuts to novel problems.

> it’s wishful thinking to assume historical data contains hidden higher-level patterns that unlock magical shortcuts to novel problems

In the right context, why not? You rely on this every day to navigate the world with more facility than a newborn.

Have you heard about the different formal notions of complexity and especially Kolmogorov complexity?
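For anyone who hasn't met the idea: Kolmogorov complexity is the length of the shortest program that outputs a given string. It is not computable in general, but the size of a compressed copy gives a crude upper bound. A rough sketch using Python's standard `zlib` (the two sample inputs are made up for illustration):

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Length of the zlib-compressed data: an upper-bound proxy for description length."""
    return len(zlib.compress(data, 9))


patterned = b"ab" * 5000                  # 10,000 bytes with an obvious short description
rng = random.Random(0)                    # fixed seed so the run is repeatable
noisy = bytes(rng.getrandbits(8) for _ in range(10_000))  # 10,000 pseudo-random bytes

print(len(patterned), compressed_size(patterned))  # shrinks to a tiny fraction
print(len(noisy), compressed_size(noisy))          # barely shrinks, if at all
```

Note the catch: the "noisy" bytes come from a seeded generator, so a very short program does produce them. The compressor just can't find that pattern, which is why compressed size is only ever an upper bound.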

Humans have the same limitation and use the same solution: showing your work and taking notes. There's no blocker here.

There is a distinction. Humans with the use of an unbounded scratchpad can emulate a general-purpose Turing machine and perform general computation given unbounded time. An LLM is still restricted to its context window, which is a comparatively extreme limitation of memory. In comparison, our general-purpose computers have so much memory that this isn't something we care about for most practical instances of hard problems that we solve with a classical CS algorithm. You can obviously modify LLMs to perform unbounded computation per token (and furnish them with a scratchpad) but afaict commercial LLMs today don't offer that.
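A small sketch of the distinction, in Python. The machine below is a hypothetical binary incrementer written for this example; `max_cells` is an invented parameter standing in for a fixed memory limit, only loosely analogous to a context window.

```python
from typing import Optional

# Transition table for a hypothetical binary-increment machine.
# (state, symbol read) -> (symbol to write, head move, next state)
INCREMENT = {
    ("carry", "1"): ("0", -1, "carry"),
    ("carry", "0"): ("1", 0, "halt"),
    ("carry", "_"): ("1", 0, "halt"),   # "_" is a blank cell
}


def run(bits: str, max_cells: Optional[int] = None) -> str:
    """Add one to a binary string, starting at the rightmost bit.

    max_cells=None models an unbounded scratchpad; an integer caps
    how many tape cells may ever be used.
    """
    tape = dict(enumerate(bits))        # sparse tape: position -> symbol
    head, state = len(bits) - 1, "carry"
    while state != "halt":
        write, move, state = INCREMENT[(state, tape.get(head, "_"))]
        tape[head] = write
        if max_cells is not None and len(tape) > max_cells:
            raise MemoryError("tape limit exceeded")
        head += move
    return "".join(tape[i] for i in sorted(tape))


print(run("1011"))   # 11 + 1 = 12 in binary
print(run("111"))    # needs one extra cell for the carry
```

With an unbounded tape the same three rules handle any input length. With `run("111", max_cells=3)` the carry needs a fourth cell and the run fails, even though the rules themselves are unchanged.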
> They're not able to reason, but we can't [succinctly] define what it is.

People also routinely fail to reason, even programmers often write "obvious" logic bugs they don't notice until it gives an unexpected result at which point it's obvious to them. So both humans and AI don't always reason. But humans reason much better.

I myself have observed ChatGPT 4 solving novel problems I invented to my personal satisfaction well enough to say that it seems to have a rudimentary ability to sometimes show abilities we would typically call reasoning, but only at the level of a child. The issue isn't that it is supposed to reason perfectly or that humans reason perfectly, the issue is that it doesn't reason well enough to succeed at completing many kinds of tasks we would like it to succeed at. Please note that nobody expects it to reason perfectly. "Prove Fermat's last theorem in a rigorous way. Produce a proof that can be checked by Coq, Isabelle, Mizar, or HOL in a format supported directly by any of them" is arguably a request that includes nothing but reasoning and writing code. But we would not expect even Wiles to be able to complete it, and Wiles has actually proved Fermat's last theorem.

So we have an idea of reasoning as completing certain types of tasks successfully, and today humans can do it and AI can't.

Today, it fails badly at tasks that require reasoning. A simple example: https://chatgpt.com/share/da95843e-218a-4d69-a161-6aa2d7a3c9...

The issue is that humans can see its answer is wrong and its "reasoning" is wrong.

The issue isn't that it never reasons correctly. It's that it doesn't do so often enough or well enough, and it doesn't complete tasks we expect humans to complete, and it doesn't always notice when it is printing something outrageously wrong and illogical.

It notices sometimes, it engages in rudimentary guesswork sometimes, but just not often enough or well enough.

> Today, it fails badly at tasks that require reasoning. A simple example: https://chatgpt.com/share/da95843e-218a-4d69-a161-6aa2d7a3c9...

> The issue is that humans can see its answer is wrong and its "reasoning" is wrong.

I've noticed with LLMs that they're more likely to come to the wrong conclusion if you prime them in that manner. In this case, you posed the follow-up question as "Will <incorrect conclusion> always be true?" As a result, it's primed to try to prove that incorrect conclusion.

(That said, ChatGPT further did not answer the posed question, as it also changed "difference" -> "absolute difference"; in fact, the difference will alternate between increasing and decreasing, while the absolute difference is strictly increasing.)
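To show why the two questions differ, here is a made-up sequence (not the one from the linked chat, whose details aren't reproduced here) where the signed difference keeps flipping direction while its absolute value only grows:

```python
# Hypothetical signed differences, chosen only to illustrate the distinction.
diffs = [(-1) ** n * n for n in range(1, 7)]   # [-1, 2, -3, 4, -5, 6]
abs_diffs = [abs(d) for d in diffs]            # [1, 2, 3, 4, 5, 6]


def strictly_increasing(xs):
    """True when every element is larger than the one before it."""
    return all(a < b for a, b in zip(xs, xs[1:]))


def alternates(xs):
    """True when consecutive steps flip between going up and going down."""
    steps = [b - a for a, b in zip(xs, xs[1:])]
    return all(s * t < 0 for s, t in zip(steps, steps[1:]))


print(strictly_increasing(diffs), alternates(diffs))   # False True
print(strictly_increasing(abs_diffs))                  # True
```

So "will the difference always increase?" and "will the absolute difference always increase?" can have opposite answers for the same sequence.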

Yes, thank you! This exactly matches my experience. The patterns are in there, they're just not prominent or developed enough to reach our level.

That's why I think of GPT3+ as "subhuman AGI," personally.

I suppose it's a question whether what we call "reasoning" is an emergent phenomenon from having enough connections in a graph, or whether it's some other special sauce which we simply don't have in our current models yet. E.g. humans follow a deductive process to answer questions which they haven't encountered yet. Do we gain this ability purely from a denser/larger graph of knowledge, or from a completely different architecture?

I think until we know the answer to this, we can't make predictions about how to build true AGI.

> E.g. humans follow a deductive process to answer questions which they haven't encountered yet.

Rarely, actually.

More generally, humans use all kinds of inference where the problem at hand is intertwined with all the other attention points that are occupying the person's mental load. Giving a topic full mental attention and finding a path through pure deduction about a circumscribed subject is a rarity, even if you consider only those situations that require any conscious attention at all to perform some action before moving on.

Not within mathematics, where it is the entire sport, and which is the point of contention.

If there is one space where it shines, surely it’s mathematics. But even there, the most notable mathematicians rely heavily on intuition long before they manage to prove anything, as well as while selecting/creating their conceptual tools to attempt to build the proof, and rarely go to the point of formalizing their points through Coq/Isabelle or even with meticulous paper craft à la Principia Mathematica from Russell and Whitehead.

Except humans correctly believe that a Coq proof is theoretically correct whereas an LLM does not have this meta reasoning ability at all.

All of our deductive reasoning is founded in induction. For example, the basis of all arithmetic is physics analogies regarding things that exist, and the understanding that a thing implies another thing is not based in deduction. Similarly, I suspect from my own experience that general reasoning requires a basic understanding of physics if its origin isn't something ineffable. The ability to connect and find implications cannot itself be purely deductive and it would seem to me that an understanding of physical reality would have to be the origin for that ability.

> an emergent phenomenon from having enough connections in a graph, or ... some other special sauce

For humans, it is emergent. But when we reason about reason, we invent special sauce.

If we build our theories of reason into our models, they achieve the strengths and limitations of our models.

If we don't, we're limited by the pace of evolution, because we don't have enough connections in our graph.

So I think we'll have something immediately more useful if we embed ALU special instructions into a neural network.

I must be in the minority here, but I don't think most people exercise any reason. I'd even venture that the vast majority of people haven't reasoned recently at all. In my mind, reasoning is an ability... a willful act to engage in thinking through an abstract problem. Most people don't do this and just use rationalization and learned behavior, which our brains are good at.

Well, 99% of day-to-day life is mundane for most living beings on earth. A bee is able to get through its entire life without showing signs that it ponders deeply about anything.

However, humans have the ability to reason about things (whether most people use this ability is a different question). So then we must ask the question: is this ability just a more advanced form of probabilistic pattern matching, or is it a different architecture altogether? Will current AI models be able to develop this ability, or will we need new models?

People do inference all the time. “Is that driver about to turn?” “Where is the water next to the faucet coming from?” “Does this person like me?”

I think for the most part that's true, but obviously there are things people want to use LLMs for that do require planning/reasoning, and it makes for unexpected failure modes if LLMs don't have this ability.

> humans follow a deductive process to answer questions which they haven't encountered yet

Nope. Most humans fall into various traps such as pattern recognition, confirmation bias, and many others instead of relying on deductive analysis. Even scientists fail at being rigorous.

Of course there are cases like this, nobody is perfect. But we are talking about mathematics here, not everyday subconscious decision making. I agree that 99% of daily life is trivial pattern recognition. That's not what distinguishes humans though, is it? Because animals, down to single-celled organisms, do just fine without higher-order mental capabilities. But we are talking about reasoning here, and specifically about structured reasoning like math.

I disagree that daily life is "trivial pattern recognition".

Just our visual object recognition is immensely powerful and far beyond any current AI. A simple task like walking to the fridge requires a ton of pattern recognition and spatial reasoning. Recognizing people's moods/predicting behaviors is also incredibly involved imo.

I've said this many times but perhaps we should focus on achieving dog level intelligence first before we start worrying about human level AGI.

Oh I'm very much with you. In fact I get irked by people here breathlessly parroting that human level AGI is upon us any day now. I'd be impressed if an AI had mouse level capabilities any time soon. I think the current models are very impressive, but they are parlor tricks compared to what a true AGI should be capable of.

> if an AI had mouse level capabilities any time soon

That's why nobody has gotten any traction selling access to AIs for $20 a month whereas selling access to mouse labor is such a thriving business.

> Just our visual object recognition is immensely powerful and far beyond any current AI.

That's a point you'll likely have to revisit pretty soon. Radiology, for instance, probably won't exist as a profession 20-30 years from now. Captchas are already pretty much done for.

Well 1. Radiology is an insanely niche subject not indicative of general intelligence, and 2. AI being good at radiology isn't about object recognition or spatial reasoning, it's data analysis connecting features to outcomes.

Lastly, check out the ARC challenge or any other spatial reasoning tests for AI. Humans get ~80% on these challenges whereas the best AI is still at 25%.