by breuleux 289 days ago
> So it makes me wonder, is embodiment (advanced robotics) 1000x harder than LLMs from an information processing perspective?

Essentially, yes, but I would go further in saying that embodiment is harder than intelligence in and of itself.

I would argue that intelligence is a very simple and primitive mechanism compared to the evolved animal body, and the effectiveness of our own intelligence is circumstantial. We manage to dominate the world mainly by using brute force to simplify our environment and then maintaining and building systems on top of that simplified environment. If we didn't have the proper tools to selectively ablate our environment's complexity, the combinatorial explosion of factors would be too much to model and our intelligence would be of limited usefulness.

And that's what we see with LLMs: I think they model relatively faithfully what, say, separates humans from chimps, but they lack the animal library of innate world understanding which is supposed to ground intellect and stop it from hallucinating nonsense. They're trained on human language, which is basically the shadows in Plato's cave. They're very good at tasks that operate in that shadow world, like writing emails, or programming, or writing trite stories, but most of our understanding of the world isn't encoded in language, except very very implicitly, which is not enough.

What trips us up here is that we find language-related tasks difficult, but that's likely because the ability evolved recently, not because they are intrinsically difficult (likewise, we find mental arithmetic difficult, but it is not intrinsically so). As it turns out, language is simple. Programming is simple. I expect that logic and reasoning are also simple. The evolved animal primitives that actually interface with the real world, on the other hand, appear to be much more complicated (but time will tell).


Nicely said. This all aligns with my intuition, with one caveat.

I think you and I are using different definitions of intelligence. I'm bought into Karl Friston's free energy principle and think it's intelligence all the way down. There is no separating embodiment and intelligence.

The LLM distinction is intelligence via symbols as opposed to embodied intelligence, which is why I really like your shadow world analogy. Without getting caught up in subtle differences in our ontologies, I agree wholeheartedly.

You're right, we probably have different ontologies. To me an intelligent system is a system which aims to realize a goal through modelling its environment and planning actions to bring about that intended state. That's more or less what humans do and I think that's more in line with the colloquial understanding of it.

There are basically two approaches to defining intelligence, I think. You can either define it in terms of capability, in which case a system that has no intent and does not plan can be more intelligent than one that does, simply by virtue of being more effective. Or you can define it in terms of mechanism: something is intelligent if it operates in a specific way. But it may then turn out to be the case that some non-intelligent systems are more effective than some intelligent systems. Or you can do both and assume that there is some specific mechanism (human intelligence, conveniently) that is intrinsically better than the others, which is a mistake people commonly make and is the source of a lot of confusion.

I tend to go for the second approach because I think it's a more useful framing to talk about ourselves, but the first is also consistent. As long as we know what the other means.

If intelligence is treated as a scale, should it be measured primarily by (a) the diversity of valid actions an entity can take combined with its ability to collect and process information about its environment and predict outcomes, or (b) only by its ability to collect and process information and predict outcomes?

In either case, the smallest unit of intelligence could be seen as a component of a two-field or particle interaction, where information is exchanged and an outcome is determined. Scaled up, these interactions generate emergent properties, and at each higher level of abstraction, new layers of intelligence appear that drive increasing complexity. Under such a view, a less intelligent system might still excel in a narrow domain, while a more intelligent system, effective across a broader range, might perform worse in that same narrow context.

Depending on the context of the conversation, I might go along with some cut-off on the scale, but I don't see why the scale isn't continuous. Maybe it has stacked s-curves though...

We just happen to exist at an interesting spot on the fractal that's currently the highest point we can see. So it makes sense we would start with our own intelligence as the idea of intelligence itself.

I think it's an issue of hierarchies and the Society of Mind (Minsky). If a human hand, or any animal's end effector, touches a hot stove, a lower-level process instantly pulls the hand/paw away from the heat. There are no doubt thousands of these 'smart body, no brain' interactions that take over in certain situations, conscious thinking not required.

Ken Goldberg shows that getting robots to operate in the real world using methods that have been successful getting LLMs to do things we consider smart -- getting huge amounts of training data -- seems unlikely. The vastness between what little data a company like Physical Intelligence has vs what GPT-5 uses is shown here: https://drive.google.com/file/d/16DzKxYvRutTN7GBflRZj57WgsFN... 84 seconds

Ken advocates plenty of Good Old-Fashioned Engineering to help close this gap, and worries that demos like Optimus actually set the field back because expectations are set too high. Like the AI researchers who were shocked by LLMs' advances, it's possible something out of left field will close this training gap for robots. I think it'll be at least 5 more years before robots will be among us as useful in-house servants. We'll see if the LLM hype has spilled over too much into the humanoid robot domain soon enough.

> But it may then turn out to be the case that some non-intelligent systems are more effective than some intelligent systems.

That is surely the case on limited scopes. For example the non neural net chess engines are better at chess than any human.

I think that to compare neural networks with human intelligence in a fair way, we should limit their training to the number of games that human professionals can reasonably play in their life. AlphaGo won't be much good after playing, let's say, 10 thousand games, even starting from the corpus of existing human games.

>There is no separating embodiment and intelligence.

And yet whatever IQ you have, it can't make you just play the violin without actually having embodied practice first.

If you have sufficient motor control and dexterity, the amount of required practice should be approximately zero. Just calculate the required finger position and bow orientation, pressure, and velocity for optimal production of the desired sound and do that. That is not how humans perform physical tasks though.
> That is not how humans perform physical tasks though

is it not though? wouldn't it just be that our processing center isn't located completely in the skull as we typically think, but is extended to our spinal cord and nervous system? Something is being processed, you're just not conscious of the entire process. This is especially clear to me as a musician: as you're learning to play, you have to be absolutely aware of all of those processes until you can finally just let go and play!

You've captured a lot here with your shadow world summary. Very well done - I've been feeling this and now you've turned it into words and I'm pretty sure you're correct!
> We manage to dominate the world mainly by using brute force to simplify our environment and then maintaining and building systems on top of that simplified environment. If we didn't have the proper tools to selectively ablate our environment's complexity…

This is very interesting and I feel there is a lot to unpack here. Could you elaborate on this theory with a few more paragraphs (or books / blogs that elucidate this)? In what ways do we use brute force to simplify the environment, and are there not ways in which we use highly sophisticated leveraged methods to simplify our environment tools? What proper tools allow us to selectively ablate complexity? Why does our intelligence only operate on simplified forms?

Also, what would convince you that symbolic intelligence is actually “harder” than embodied intelligence? To me the natural test is how hard it is for each one to create the other. We know it took a few billion years to go from embodied intelligence (ie organisms that can undergo evolution, with enough diversity to survive nearly any conditions on Earth) to sophisticated symbolic intelligence. What if it turns out that within 100 years, symbolic intelligence (contained in LLM like systems) could produce the insights to eg create new synthetic life from scratch that was capable of undergoing self-sustained evolution in diverse and chaotic environments? Would this convince you that actually symbolic intelligence is the harder problem?

Not OP, but several examples:

A. Instead of building a house on random terrain with random materials, we prefer to flatten the place first, then use standard materials (e.g. bricks) produced from a simple source (e.g. a large and relatively homogeneous deposit of clay).

B. For mental tasks it’s usually said that a person can handle only 7 items at a time (if you disagree, multiply by 2-3). But when you ride a bike you process more inputs at the same time (you hear a car behind you, you see a person on the right, you feel your balance, you anticipate your direction, if you feel strong wind or sun on your face you probably squint your eyes, you take a breath of air. On top of that all the processes of your body adjust and support your riding: heart, liver, stomach…)

C. “Spherical cows” in physics. (Google this if needed)

> Why does our intelligence only operate on simplified forms?

Part of the issue with discussing this is that our understanding of complexity is subjective and adapted to our own capabilities. But the gist of it is that the difficulty of modelling and predicting the behavior of a system scales very sharply with its complexity. At the end of the scale, chaotic systems are basically unintelligible. Since modelling is the bread and butter of intelligence, any action that makes the environment more predictable has outsized utility. Someone else gave pretty good examples, but I think it's generally obvious when you observe how "symbolic-smart" people think (engineers, rationalists, autistic people, etc.) They try to remove as many uncontrolled sources of complexity as possible. And they will rage against those that cannot be removed, if they don't flat out pretend they don't exist. Because in order to realize their goals, they need to prove things about these systems, and it doesn't take much before that becomes intractable.

One example of a system that I suspect to be intractable is human society itself. It is made out of intelligent entities, but as a whole I don't think it is intelligent, or that it has any overarching intent. It is insanely complex, however, and our attempts to model its behavior do not exactly have a good record. We can certainly model what would happen if everybody did this or that (aka a simpler humanity), but everybody doesn't do this and that, so that's moot. I think it's an illuminating example of the limitations of symbolic intelligence: we can create technology (simple), but we have absolutely no idea what the long term consequences are (complex). Even when we do, we can't do anything about it. The system is too strong, it's like trying to flatten the tides.

> To me the natural test is how hard it is for each one to create the other.

I don't think so. We already observe that humans, the quintessential symbolic intelligences, have created symbolic intelligence before embodied intelligence. In and of itself, that's a compelling data point that embodied is harder. And it appears likely that if LLMs were tasked to create symbolic intelligences, even assuming no access to previous research, they would recreate themselves faster than they would create embodied intelligences. Possibly they would do so faster than evolution, but I don't see why that matters, if they also happen to recreate symbolic intelligence even faster than that. In other words, if symbolic is harder... how the hell did we get there so quick? You see what I mean? It doesn't add up.

On a related note, I'd like to point out an additional subtlety regarding intelligence. Intelligence (unlike, say, evolution) has goals and it creates things to further these goals. So you create a new synthetic life. That's cool. But do you control it? Does it realize your intent? That's the hard part. That's the chief limitation of intelligence. Creating stuff that is provably aligned with your goals. If you don't care what happens, sure, you can copy evolution, you can copy other methods, you can create literally anything, perhaps very quickly, but that's... not smart. If we create synthetic life that eats the universe, that's not an achievement, that's a failure mode. (And if it faithfully realizes our intent then yeah I'm impressed.)

I think a lot of this is true, but not as critical as is being interpreted.

Compare the economics of purely cognitive AI to in-world robotics AI.

Pure cognitive: Massive scale systems for fast, frictionless and incredibly efficient cognitive system deployment and distribution of benefits are solved. On tap even. Cloud computing and the Internet.

What is the amortized cost per task? Almost nothing.

In-world: The cost of extracting raw resources, parts chain, material process chain, manufacturing, distributing, maintaining, etc.

Then what is the amortized cost per task, for one robot?

Several orders of magnitude more expensive, per task! There is no comparison.

Doing that profitably isn’t going to be the norm for many years.

At what price does a kitchen robot make sense? Not at $1,000,000. “Only $100,000?” “Only $25,000?” “Only $10k?” Lower than that?

Compared to a Claude plan? That many people still turn down just to use free tier?

Long before general house helper robots make any economic sense, we will have had walking, talking, socializing, profitable-to-build sex robots at higher price points for price-insensitive owners.

There are people who will pay high prices for that, when costs come down.

That will be the canary for general robotic servants or helpers.

The cost isn’t intelligence. There isn’t a particular challenge with in-world information processing and control. It’s the cost of the physical thing that processing happens in.

This is a purely economic problem. Not an AI problem at all.

It took about the same amount of time to evolve human-level intelligence as human-level mobility. Pretty much no other animal walks on two legs...
This is interesting to think about. It’s basically just birds and primates. Birds have an ancient evolutionary tree as they are dinosaurs, which did actually walk on two legs. But the gap between dinos and primates walking on two feet, I think, is tens of millions of years. So yea pretty long time.
This makes me think something else, though. Once we were able to reason about the physics behind the way things can move, we invented wheels. From there it's a few thousand years to steam engines and a couple hundred more years to jet planes and space travel.

We may have needed a billion years of evolution from a cell swimming around to a bipedal organism. But we are no longer speed limited by evolution. Is there any reason we couldn't teach a sufficiently intelligent disembodied mind the same physics and let it pick up where we left off?

I like the notion of the LLM's understanding being "shadows on the wall of Plato's cave," and language may be just that. But math and physics can describe the world much more precisely and, if you pair them with the linguistic descriptors, a wall shadow is not very different from what we perceive with our own senses and learn to navigate.

Note that wheels, steam engines, jet planes, spaceships wouldn't survive on their own in nature. Compared to natural structures, they are very simple, very straightforward. And while biological organisms are adapted to survive or thrive in complicated, ever-changing ecosystems, our machines thrive in sanitized environments. Wheels thrive on flat surfaces like roads, jet planes thrive in empty air devoid of trees, and so on. We ensure these conditions are met, and so far, pretty much none of our technology would survive without us. All this to say, we're playing a completely different game from evolution. A much, much easier game. Apples and oranges.

As for limits, in my opinion, there are a few limits human intelligence has that evolution doesn't. For example, intent is a double-edged sword: it is extremely effective if the environment can be accurately modelled and predicted, but if it can't be, it's useless. Intelligence is limited by chaos and the real world is chaotic: every little variation will eventually snowball into large scale consequences. "Eventually" is the key word here, as it takes time, and different systems have different sensitivities, but the point is that every measure has a half-life of sorts. It doesn't matter if you know the fundamentals of how physics work, it's not like you can simulate physics, using physics, faster than physics. Every model must be approximate and therefore has a finite horizon in which its predictions are valid. The question is how long. The better we are at controlling the environment so that it stays in a specific regime, the more effective we can be, but I don't think it's likely we can do this indefinitely. Eventually, chaos overpowers everything and nothing can be done.
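To make the "finite horizon" point concrete, here is a minimal sketch using the logistic map, a standard toy chaotic system. It is not anything from the thread: the function names, the starting point 0.2, and the 0.1 tolerance are all assumptions picked for illustration. The "model" is the exact same rule as the "world", and its only flaw is a tiny error in the initial measurement.

```python
def logistic(x, r=4.0):
    """One step of the logistic map; r = 4.0 puts it in the chaotic regime."""
    return r * x * (1.0 - x)


def prediction_horizon(x0, error, tolerance=0.1, max_steps=1000):
    """Count steps until a model started at x0 + error drifts more than
    `tolerance` away from the true trajectory started at x0.

    All parameter values are illustrative assumptions, not measurements.
    """
    truth, model = x0, x0 + error
    for step in range(1, max_steps + 1):
        truth, model = logistic(truth), logistic(model)
        if abs(truth - model) > tolerance:
            return step
    return max_steps  # never diverged within the step budget


if __name__ == "__main__":
    # Shrink the initial measurement error and see how much horizon it buys.
    for error in (1e-4, 1e-7, 1e-10):
        print(f"initial error {error:g}: horizon {prediction_horizon(0.2, error)} steps")
```

In a system like this the horizon tends to grow only about logarithmically as the initial error shrinks, so a millionfold better measurement buys a handful of extra steps rather than a millionfold longer forecast.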

Evolution, of course, having no intent, just does whatever it does, including things no intelligence would ever do because it could never prove to its satisfaction that it would help realize its intent.

Okay, but (1) we don't need to simulate physics faster than physics to make accurate-enough predictions to fly a plane, in our heads, or build a plane on paper, or to model flight in code. (2) If that's only because we've cleared out the trees and the Canada Geese and whatnot from our simplified model and "built the road" for the wheels, then necessity is also the mother of invention. "Hey, I want to fly but I keep crashing into trees" could lead an AI agent to keep crashing, or model flying chainsaws, or eventually something that would flatten the ground in the shape of a runway. In other words, why are we assuming that agents cannot shape the world (virtual, for now) to facilitate their simplified mechanical and physical models of "flight" or "rolling" in the same way that we do?

Also, isn't that what's actually scary about AI, in a nutshell? The fact that it may radically simplify our world to facilitate e.g. paper clip production?

> we don't need to simulate physics faster than physics to make accurate-enough predictions to fly a plane

No, but that's only a small part of what you need to model. It won't help you negotiate a plane-saturated airspace, or avoid missiles being shot at you, for example, but even that is still a small part. Navigation models won't help you with supply chains and acquiring the necessary energy and materials for maintenance. Many things can -- and will -- go wrong there.

> In other words, why are we assuming that agents cannot shape the world

I'm not assuming anything, sorry if I'm giving the wrong impression. They could. But the "shapability" of the world is an environment constraint, it isn't fully under the agent's control. To take the paper clipper example, it's not operating with the same constraints we are. For one, unlike us (notwithstanding our best efforts to do just that), it needs to "simplify" humanity. But humanity is a fast, powerful, reactive, unpredictable monster. We are harder to cut than trees. Could it cull us with a supervirus, or by destroying all oxygen, something like that? Maybe. But it's a big maybe. Such brute force requires a lot of resources, the acquisition of which is something else it has to do, and it has to maintain supply chains without accidentally sabotaging them by destroying too much.

So: yes. It's possible that it could do that. But it's not easy, especially if it has to "simplify" humans. And when we simplify, we use our animal intelligence quite a bit to create just the right shapes. An entity that doesn't have that has a handicap.

>Also, isn't that what's actually scary about AI, in a nutshell? The fact that it may radically simplify our world to facilitate e.g. paper clip production?

No, it's more about massive job losses and people left to float alone, mass increase in state control and surveillance, mass brain rot due to AI slop, and full deterioration of responsibility and services through automation and AI as a "responsibility shield".

Something that isn’t obvious when we’re talking about the invention of the wheel: we aren’t actually talking about the round shape thing, we’re actually talking about the invention of the axle which allowed mounting a stationary cart on moving wheels.
And the roadways (later, rails) on which it operates.

Meanwhile, entire civilizations in South America developed with little to no use of wheels, because the terrain was unsuited to roads.

It wasn't actually just terrain. It was actually availability of draft animals, climate conditions and actually most importantly... economics.

Wheeled vehicles aren't inherently better in a natural environment unless they're more efficient economically than the alternatives: pack animals, people carrying cargo, boats, etc.

South America didn't have good draft animals and lots of Africa didn't have the proper economic incentives: Sahara had bad surfaces where camels were absolutely better than carts and sub Saharan Africa had climate, terrain, tsetse flies and whatnot that made standard pack animals economically inefficient.

Humans are smart and lazy, they will do the easiest thing that lets them achieve their goals. This sometimes leads them to local maxima. That's why many "obvious" inventions took thousands of years to create (cotton gin, for example).

Yes, only humans, birds, sifakas, pangolins, kangaroos, and giant ground sloths. Only those six groups of creatures, and various lizards including the Jesus lizard which is bipedal on water, just those seven groups and sometimes goats and bears.
I get what you mean, that’s why the "basically" is there. Most, kangaroos and some lemurs in your list being the exception, do not move around primarily as bipeds. The ability to walk on two legs occasionally is different than genuinely having two legs and two arms.
And every once in a while, my cat.
Human-level mobility however is not much to write home about. Just one more variation of the many types seen in animals.

Human level intelligence is, otoh, qualitatively and quantitatively a bigger deal.

I wouldn't agree completely. Being bipedal frees up the hands for anything, really.

We're better than most animals because we have tools. We have great tools because we have hands.

Birds? Bears whose front paws got injured? https://youtu.be/kcIkQaLJ9r8
Birds didn't develop hands, neither did bears. Also bears can't walk 100km on their hind legs, but we can.
Talking about "time to evolve something" seems patently absurd and unscientific to me. All of nature evolved simultaneously. Nature didn't first make the human body and then go "that's perfect for filling the dishwasher, now to make it talk amongst itself" and then evolve intelligence. It all evolved at the same time, in conjunction.

You cannot separate the mind and the body. They are the same physiological and material entity. Trying anyway is of course classic western canon.

>Nature didn't first make the human body and then go "that's perfect for filling the dishwasher, now to make it talk amongst itself" and then evolve intelligence. It all evolved at the same time, in conjunction.

Nature didn't make decisions about anything.

But it also absolutely didn't "all evolved at the same time, in conjunction" (if by that you mean all features, regarding body and intelligence, at the same rate).

>You cannot separate the mind and the body. They are the same physiological and material entity

The substrate is. Doesn't mean the nature of abstract thinking is the same as the nature of the body, in the same way the software as algorithm is not the same as hardware, even if it can only run on hardware.

But to the point: this is not about separating the "mind and the body". It's about how you can have humanoid form and all the typical human body functions for millions of years before you get human-level intelligence, after much later evolution.

>Trying anyway is of course classic western canon.

It's also classic eastern canon, and several others besides.

> The substrate is. Doesn't mean the nature of abstract thinking is the same as the nature of the body, in the same way the software as algorithm is not the same as hardware, even if it can only run on hardware.

In this you are positing the existence of a _soul_ that exists separately from the body, and is portable amongst bodies. Analogous to how an algorithm (disembodied software) exists outside of the hardware and is portable amongst it (by embodying it as software).

I don't agree with that at all, and it's impossible to know if you're right, but I can at least understand why you have a hard time with my argument and the east-west difference if the tradition of the existence of a soul is that "obvious" to you.

I think whether it's "portable amongst bodies" is orthogonal. A specific consciousness of person X can very well only exist within the specific body of person X, and my argument still remains the same (not saying it's right, just that it's not premised on the constraint that there's a soul and it's independent/portable being true).

The argument is that whether consciousness is independent of a specific body or not, it's still of a different nature.

The consciousness part uses the body (e.g. nervous system, neurons etc), but its nature is the informational exchange and its essence is not in the construction of the body as a physical machine (though that's its base), but in the stored "weights" encoding memories and world-knowledge.

Same as with a CPU: a specific program it runs is not defined by the CPU but by the memory contents (data and variables and logic code). It might as well run on an abstract CPU, or one made of water tubes or billiard balls.

Of course in our case, the consciousness runs on a body - and only a specific body - and can't exist without one (same way a program can't exist as a running program without a CPU). But it doesn't mean it's of the same nature as the body - just that the body is its substrate.

Plato's "Allegory of the cave" was uninteresting and uninformative when I first read it more than 50 years ago. It remains so today.

https://en.wikipedia.org/wiki/Allegory_of_the_cave

Also, other than in sculpture/dentistry/medicine I also find "ablation" to not be a particularly insightful metaphor either. Although I see ablation's application to LLMs I simply had to laugh when I first read about it: I envisioned starting with a Greyhound bus and blowing off parts until it was a Lotus 7 sports car!8-). Good luck with that! Kind of like fixing the TV set by kicking it (but it _does_ work sometimes!).

Perhaps we should refrain somewhat from applying metaphors/simile/allegories to describe LLMs relative to human intelligence unless they provide some insight of significant value.

>Plato's "Allegory of the cave" was uninteresting and uninformative when I first read it more than 50 years ago. It remains so today.

Anything can be uninteresting and uninformative when one doesn't see its interestingness or can't grok its information.

It has, however, stood for millennia as a great device to describe multiple layers of abstractions, deeper reality vs appearance, and so on, with utility as such in countless domains.

No. The Allegory is a fragment of a poor unfinished story and little more. You don't need it to explain "multiple layers of abstractions, deeper reality vs appearance" as you say. In fact, you don't need it for anything at all except to explain Plato's "Allegory of the cave". Sheesh.

coldtea says "...with utility as such in countless domains." So when's the last time you referred to the "Allegory of the cave" in your day, other than on HN?

>So when's the last time you referred to the "Allegory of the cave" in your day, other than on HN?

Several times. But it was with broadly educated people, not over-specialized one-dimensional ones.

I don’t think that’s what ablation is about. It’s more like blowing parts off a bus until it ceases to be a bus. Then you find the minimal set of bus parts required to still be a bus, and that’s an indication that those parts are important to the central task of being a bus.
taneq says "I don’t think that’s what ablation is about. It’s more like blowing parts off a bus until it ceases to be a bus."

Different people have different goals. You want some form of minimal bus and I want a Lotus 7. There's no guarantee either of us reach our goal.

Ablation is about disassembling something randomly, whether little by little or on an arbitrary scale until [SOMETHING INTERESTING OR DESIRABLE HAPPENS].

https://en.wikipedia.org/wiki/Ablation_(artificial_intellige...

Ablation is laughable but sometimes useful. It is also easy, mostly brainless, NOT guaranteed to provide any useful information (so you've an excuse for the wasted resources), and occasionally provides insight. It's a good tool for software engineers who have no (or seek no) understanding of their system, so I think of ablation as a "last resort" solution (another being to randomly modify code until it "works") that I disdain.

But I'm old so I'm probably wrong! Burn those CPU towers down, boys and girls!