Hacker News
by sberens 704 days ago
What is an abstract reasoning task that your average 15 year old (who has "general intelligence") can do that you think LLMs can't do?
11 comments

A 15 year old can reason about how to move their body through a complex obstacle course. They could reason about the nonverbal social cues in a complex interpersonal situation between multiple people, estimate the mood of each person even if there are very few words being exchanged, and determine how different possible actions would affect the situation. They could learn with brief instruction how to control their muscles to climb up a rope. They could learn how to learn so that they become better at a task of their choosing. They can receive new information that permanently changes their understanding of the world. They can learn new tasks for which no massive data set of training data exists. They can perform hierarchical reasoning, like “if I want to fly from San Francisco to New York I first need to buy a plane ticket, then pack my bags, tell my family where I will be going, make sure my phone is charged, walk to the train station, etc etc.”

Also if you ask them a question they can provide you one answer with very little thinking, and then if that’s not good enough they can devote more time to thinking about the answer before they answer again. They can devote arbitrary levels of thinking to any problem depending on what is needed. They can continuously take in new data and continually update their world view throughout their entire existence based on this new information.

There’s actually a huge list of things current autoregressive approaches to AI cannot do, but they can be hard to describe and people don’t like to talk about them, so many people don’t understand how limited the current systems are.

Here’s a great video where Yann LeCun talks about the limits of autoregressive approaches to AI with many examples:

https://youtu.be/1lHFUR-yD6I

Also: https://sl.bing.net/ep8K7FWVAHY

The quality of your argument is very low. You didn't even bother to check yourself.

That’s fair. In the interview LeCun uses the example of flying from San Francisco to New York and he asserts that these systems are not good at hierarchical reasoning. I’m no expert in this field so I take him at his word but maybe it warrants further explanation.

He also says that such a system wouldn’t be familiar with how to actually move through the world because we don’t have good datasets for how to do so. The rest of what I said still stands. These systems aren’t good at things for which we don’t have massive datasets, and they’re not able to devote different amounts of thinking time to different problems.

What does any of what you said have to do with abstract reasoning?
What isn’t abstract about looking at an obstacle course and then imagining how you will move your body? Or looking at someone’s face and imagining how they feel? Isn’t that abstract?
These "it's like a young/stupid person" arguments are wretched. LLMs are interesting but it should be obvious their development is not comparable to the development of human beings.
This.

It's obvious to everyone who isn't willfully blind that LLMs aren't truly intelligent, and all the mental gymnastics that people go through to try to portray LLMs as genuinely intelligent are just so tedious.

Speaking from experience.

More specifically, something like “what’s the best brand of phone”. The LLM just summarizes common knowledge. But even a child will grasp some of the differences and have opinions drawn from experience.

Note that this isn’t just an anthro-good argument. AI systems could have experiences and be trained on long duration tasks with memory of what worked and why.

Doing any job for more than an hour without completely forgetting its goals and tasks.
How long do you expect LLMs/agents to be unable to do this?
Good question, I'm working on exactly this, I suppose you could call it the replacement of RAG.

It's actually not very easy to achieve this. I could give a very long-winded answer (don't tempt me), but suffice it to say it's a resolution problem.

Every AI system has a fixed resolution at creation. Long-running tasks focus on an ever-narrowing space at each step, so the resolution required for an infinite task is infinite.

No 9s of error will ever fix this.

Funnily enough, small animals do this with ease, so I strongly disagree with the idea that our AI systems outcompete even small mammals in every way.

Personally, I think that phenomenon (along with "hallucinations") is fundamentally baked into LLMs writ large.

I think LLMs are a dead end on the path to AGI.

I think hallucinations are actually the sign that LLMs are far closer to a real brain than we realize.

I think hallucinations are a major unexplored gateway to AGI.

I agree. Whenever people complain about LLM hallucinations, they behave as if they have never seen one in humans.

Not only do humans hallucinate all the time, they also have persistent hallucinations, as is evident from the presence of opposing beliefs in various slices of society.

Current LLMs have a number of limitations that human reasoning doesn't. Whether these are intrinsic to the technology or can be overcome with larger and better datasets is an open question.
If you mean LLMs today: Write code that works. More than 100k tokens worth.

Learn something without a megawatt hour of power.

Read a novel and talk about what it really means.

It's extremely ironic that you picked a megawatt hour, because that is approximately the amount of energy humans need to get good at anything, according to the popular proverb.

But don't worry just yet, GPT-4o could not detect the irony on its own either.
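A quick back-of-the-envelope check of the megawatt hour remark above. Both inputs are assumptions, not figures given in the thread: a human running at roughly 100 W of metabolic power, and the "10,000 hours" rule of thumb taken as the proverb being referred to.

```python
# Assumptions (not stated in the thread):
#   - an average human body dissipates roughly 100 W
#   - "the popular proverb" is read as the 10,000 hours of practice rule
HUMAN_POWER_WATTS = 100
PRACTICE_HOURS = 10_000


def practice_energy_mwh(power_watts: float, hours: float) -> float:
    """Energy in megawatt hours used by running at `power_watts` for `hours`."""
    watt_hours = power_watts * hours
    return watt_hours / 1_000_000  # 1 MWh = 1,000,000 Wh


energy_mwh = practice_energy_mwh(HUMAN_POWER_WATTS, PRACTICE_HOURS)
print(energy_mwh)  # 1.0
```

Under those assumptions the numbers land on about one megawatt hour, which is the coincidence the comment is pointing at.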

Differentiating between a puppy and a husky against a snowy background without being trained on millions of images?
I wouldn't say humans are so different. You could argue we've been trained on about one quadrillion bytes of visual data by the time we're 4 years old: https://x.com/ylecun/status/1750614681209983231
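For a sense of how an estimate like "one quadrillion bytes" can come about, here is a rough sketch. The two inputs are assumptions chosen for illustration, not numbers stated in this thread: about 16,000 waking hours in the first four years, and about 20 MB/s of data through the optic nerves.

```python
# Assumed inputs for illustration only (not stated in the thread):
WAKING_HOURS_BY_AGE_4 = 16_000
OPTIC_NERVE_BYTES_PER_SECOND = 20_000_000  # roughly 20 MB/s

SECONDS_PER_HOUR = 3600


def visual_bytes(hours: int, bytes_per_second: int) -> int:
    """Total bytes seen over `hours` of waking time at a constant data rate."""
    return hours * SECONDS_PER_HOUR * bytes_per_second


total_bytes = visual_bytes(WAKING_HOURS_BY_AGE_4, OPTIC_NERVE_BYTES_PER_SECOND)
print(f"{total_bytes:.2e}")  # 1.15e+15
```

With those inputs the total is on the order of 10^15 bytes, i.e. about one quadrillion.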
As a counterpoint: a child gets pseudo-random training from parents and environment. I'm not sure what price tag to put on that, but by comparison, how many billions have gone into LLMs, and to reach what level of competency exactly?
GPT-4 is also really bad right now at comprehending “new” software libraries (even when I ask it to scrape the web).
Why does it matter how it was trained?
Because that tells us how you approach novel problems. If you need tons of data to solve a novel problem, that makes you bad at solving novel problems, while humans can get up to speed in a new domain with much less training and thus solve problems the LLM can't.

Thus AGI needs to be able to learn something new with similar amounts of data as a human, or else it isn't an AGI, as it won't be even close to as good as a human at novel tasks.

Counting the number of “r”s in “strawberry”
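For contrast, the same task is a one-liner for ordinary code that works on individual characters. A commonly offered explanation for why LLMs stumble here is that they process subword tokens rather than letters; that explanation is an assumption in this note, not something established in the thread. The helper name `count_letter` is made up for illustration.

```python
def count_letter(word: str, letter: str) -> int:
    """Count case-insensitive occurrences of a single letter in a word."""
    return word.lower().count(letter.lower())


r_count = count_letter("strawberry", "r")
print(r_count)  # 3
```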
Drawing a room without an elephant in it.
> your average 15 year old

What's the point of this restriction? It really just presupposes the limitations of LLMs, so that any negative points would look moot.

EDIT: Also, I tried to discuss this very specific point w/ GPT, but it didn't really "get" it. 15-year-old kids would be able to follow through.

Actually caring about another person.
How do you measure that?
TIL "caring" == "abstract reasoning".