Hacker News
by parpfish 611 days ago
I see lots of arguments that LLMs aren't really intelligent because they lack understanding and are "just doing autocomplete". But I never see any precise definitions of what "understanding" is, so it comes across as kind of a hand-wavy defense to make sure that human-like intelligence remains special and that we can say machines don't have it.
2 comments

I agree. For definitions, I'd roughly say:

* If something changes to refine its future behavior in response to its experiences (touch hot stove, get hurt, avoid in future), beyond the immediate/direct effect (withdrawing hand), then it can "learn". I think even small microorganisms can learn, with the main requirement being that they have some mutable state (can't learn if you can't change).

* If something can map modalities into representations in a semantic space (the word "horse" into the concept of a horse) then it can "understand". There are varying degrees to how useful an understanding is (Does the semantic space link related concepts closely together? Can it be used to reason, extract information, and make predictions?). I think current LLMs can, to a certain extent, understand text.

* If something has a continually changing internal train of thought (representations of concepts and intentions, evolving over time) then it can "think". I wouldn't say current LLMs think, but that's mostly just down to architecture (no persistent internal state) as opposed to any fundamental impossibility.
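
To make the first two definitions concrete, here's a rough toy sketch in Python. Everything in it is made up for illustration: `StoveAgent`, the `EMBED` table, and the hand-picked vectors are assumptions, not taken from any real model or library. It only shows that "learning" needs mutable state and that a semantic space puts related concepts close together; it doesn't attempt to model "thinking".

```python
import math


class StoveAgent:
    """Hypothetical agent whose only mutable state is a set of things to avoid."""

    def __init__(self):
        self.avoid = set()  # mutable state: without it, nothing can be learned

    def touch(self, thing, hurts):
        if thing in self.avoid:
            return "avoided"        # future behavior refined by past experience
        if hurts:
            self.avoid.add(thing)   # lasting change, beyond the reflex itself
            return "withdrew hand"  # immediate/direct effect
        return "touched"


# Hand-picked toy vectors standing in for a semantic space (an assumption,
# not the output of any real embedding model).
EMBED = {
    "horse": (0.9, 0.8, 0.1),
    "pony": (0.8, 0.9, 0.2),
    "stove": (0.1, 0.0, 0.9),
}


def cosine(a, b):
    """Cosine similarity: closer to 1.0 means the concepts sit closer together."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


agent = StoveAgent()
first = agent.touch("hot stove", hurts=True)   # reflex, and the state changes
second = agent.touch("hot stove", hurts=True)  # same stimulus, new behavior

related = cosine(EMBED["horse"], EMBED["pony"])
unrelated = cosine(EMBED["horse"], EMBED["stove"])
```

In this toy setup the second touch is avoided because the agent's state changed, and "horse" scores closer to "pony" than to "stove". How useful such a space is for reasoning or prediction is the "varying degrees" part.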

More broadly I believe people already have definitions similar to these, but will then create a distinction between, say, standard "learning" (as above) and then "actual learning" which is something special only attainable by humans (or at least biological brains).

I'm reminded of how our understanding of human object recognition was affected by computer vision research.

For decades we knew that there were neurons with simple receptive fields in V1/V2 that extracted low-level visual features, and that those neurons passed information along the ventral visual stream, and by the end of that processing stream we had neurons in IT that represented different objects.

However, we couldn't really comprehend what sort of algorithm/process was capable of this seemingly magical inference. Coming up with an object representation that was invariant to out-of-plane rotation was seen as impossibly complex.

But then computer vision came along and showed us that with a relatively simple neural net and enough training data... it just kind of works.

Same thing is happening with LLMs right now -- a seemingly impossible, mysterious human capability (e.g., "understanding") isn't as complex as we think. Throw enough data into a network that does pattern matching/autocomplete and human-like intelligence pops out.