Hacker News
by benrutter 42 days ago
Not OP, but I think the argument here would be not that LLMs "are not smart" but that smart is just the wrong category of thing to describe an LLM as.

A calculator can do very complex sums very quickly, but we don't tend to call it "smart" because we don't think it's operating intelligently to some internal model of the world. I think the "LLMs are AGI" crowd would say that LLMs are, but it's perfectly consistent to think the output of LLMs is consistent/impressive/useful, but still maintain that they aren't "smart" in any meaningful way.

3 comments

> "we don't think it's operating intelligently to some internal model of the world"

Okay, but you have to actually address why you think LLMs lack an "internal model of the world"

You can train one on 1930s text, and then teach it Python in-context.

They've produced multiple novel mathematical proofs now; Terence Tao is impressed with them as research assistants.

You can very clearly ask them questions about the world, and they'll produce answers that match what you'd get from a "model" of the world.

What are weights, if not a model of the world? It's got a very skewed perspective, certainly, since it's terminally online and has never touched grass, but it still very clearly has a model of the world.

I'd dare say it's probably a more accurate model than the average person has, too, thanks to having Wikipedia and such baked in.

I should say that quote was referring to a calculator - I wasn't trying to stake a position on LLMs in that comment, more just pointing out that I think it's consistent to think they're helpful without thinking they have AGI.

There's obviously a lot more of a case for suggesting LLMs are generally intelligent than a calculator, but for me, I think the key point is that understanding them as "next token generators" is a lot more helpful to explain things like hallucinations and some of the other issues/loops they get into.

For me, if understanding models as "generally intelligent agents operating with an internal model of the world" explained their behaviour better than "next token generators", I'd think calling them "smart" would have some justification[0]. I'm just a person on the internet though, and defining intelligence is pretty rarely clear, even without bringing LLMs into the mix.

[0] In case it's interesting to anyone, I'm basically giving a half-baked version of how Daniel Dennett defined intention: https://en.wikipedia.org/wiki/Intentional_stance

I would analogize LLMs to physics simulations in software. Game engines, for example, simulate physics well enough to provide a good enough semblance of real-world physics for suspension of disbelief, but we would never mistake it for real-world physics. Sufficiently complicated simulations, e.g. for weather forecasting, nuclear weapons, or QCD, can provide insights and prove physics theories, but again, experts would never mistake them for real-world physics and would be able to explain where the simulation breaks down when trying to predict real-world behavior.

Now we have these LLMs that provide some simulation of reasoning merely through prediction of token patterns, and that is indeed unexpected and astonishing. However, the AI promoters want to suggest that this simulation of reasoning is human-level reasoning, or evolving toward it, and this is the same as mistaking game engine physics for real physics. The failure cases (e.g. the question of whether to walk or drive to a car wash next door, or the trouble generating an image of a full glass of wine), even if patched away, are enough to reveal the token predictor underneath.

Intelligence can be defined as an optimization problem: "find X which maximizes F(X, Y)", where X is the solution, Y is the constraints, and F is the optimality/fitness criterion. Most other definitions are inane. E.g. "invent an aircraft" can be described as an optimization over possible build instructions, under given constraints on base materials, that maximizes the ability to fly. Absolutely any invention can be formulated as an optimization problem.
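To make the "find X which maximizes F(X, Y)" framing concrete, here's a rough sketch in Python. Everything in it is made up for illustration: `best_solution`, `toy_fitness`, and the toy "aircraft" designs (wing area, weight) are hypothetical stand-ins, and brute-force search over a tiny candidate list is just the simplest possible optimizer, not a claim about how any real system works.

```python
from itertools import product
from typing import Callable, Iterable, Optional, Tuple

Design = Tuple[int, int]  # hypothetical (wing_area, weight) pair


def best_solution(
    candidates: Iterable[Design],
    constraints: int,
    fitness: Callable[[Design, int], float],
) -> Optional[Design]:
    """Return the candidate X maximizing fitness(X, Y), or None if there are no candidates."""
    return max(candidates, key=lambda x: fitness(x, constraints), default=None)


def toy_fitness(design: Design, max_weight: int) -> float:
    """Toy F(X, Y): wing area per unit weight; designs over the weight limit are ruled out."""
    wing_area, weight = design
    if weight > max_weight:
        return float("-inf")  # violates the constraint Y
    return wing_area / weight


# Enumerate every combination of wing area and weight (the search space for X).
designs = list(product([10, 20, 30], [100, 200, 300]))
best = best_solution(designs, 250, toy_fitness)
print(best)
```

The interesting part of the argument is of course the size of the search space: here it's nine candidates, whereas "all possible build instructions" is not something you can enumerate.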

It's not like a calculator because LLM can solve very broad classes of problems - you'd struggle to define problems which LLM can't solve (given some fine-tuning, harness, KB, etc).

All this talk about "smartness" isn't even particularly cute...

> It's not like a calculator because LLM can solve very broad classes of problems

I definitely buy this, at least somewhat. Personally I think it'd be a lot more helpful to talk about how "generalisable" a tool is, rather than "general intelligence". LLMs can definitely solve a much broader class of problems than a calculator.

I don't know that "artificial general intelligence" or even "general intelligence" has a very good definition. Personally, I feel like "solving problems generally" doesn't capture what I mean when I use those kinds of terms. For one, it makes a Swiss Army knife seem more intelligent than a cat, which seems the opposite of what I'd want a good definition of general intelligence to do.

> It's not like a calculator because LLM can solve very broad classes of problems

So can computer programs. Are computer programs intelligent?

A specific program solves only a specific, narrow class of problems.

If you make a program which can solve many different classes of problems that's called AI.

> If you make a program which can solve many different classes of problems that's called AI.

What about Salesforce? That solves a ton of different problems!

And introduces a ton of new problems, too; which is strong evidence that Salesforce is intelligent!