by shock-value 1211 days ago
The advancement with these LLMs lies in the fact that they can effectively learn to recognize patterns within “large-ish” input text sequences and probabilistically generate a likely next word given those patterns.

It’s a genuine advancement. However, it is still just pattern matching. And describing anything it’s doing as “behavior” is a real stretch, given that it is a feed-forward network that does not incorporate any notion of agency, memory, or deliberation into its processing.


You are comparing systems that generated completion text based on statistics and correlation with a system that now models actual complex functional relationships between millions of concepts, not just series of letters or words.

The difference is staggering.

It comes about because of the enormous number of computational iterations (which ordinary statistical completion does not require) that map many terabytes of data into a set of parameters. Those parameters are constrained to work together in a way (layers of alternating linear combinations followed by non-linear compressions) that requires functional relationships to be learned in order to compress the information enough to work.
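
For anyone who wants the "alternating linear combinations followed by non-linear compressions" part spelled out, here is a minimal sketch in plain Python. The layer sizes, the tanh squashing function, and the names `linear`, `squash`, `forward`, and `random_layer` are my own illustrative assumptions, not anything taken from a real LLM, and the weights are random rather than trained.

```python
import math
import random

def linear(x, weights, bias):
    # Linear combination: one weighted sum of the inputs per output unit.
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def squash(x):
    # Non-linear compression: tanh maps every value into (-1, 1).
    return [math.tanh(v) for v in x]

def forward(x, layers):
    # Alternate linear -> non-linear for every layer except the last,
    # which stays linear so the output is not squashed.
    for weights, bias in layers[:-1]:
        x = squash(linear(x, weights, bias))
    weights, bias = layers[-1]
    return linear(x, weights, bias)

def random_layer(n_in, n_out, rng):
    # Hypothetical untrained layer: random weights, zero bias.
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

rng = random.Random(0)
layers = [random_layer(4, 8, rng), random_layer(8, 8, rng), random_layer(8, 3, rng)]
output = forward([1.0, 0.5, -0.5, 2.0], layers)
print(output)
```

Training is what would adjust those weights so that the stacked layers end up encoding useful relationships; the sketch only shows the shape of the computation.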

It is a profound difference both in methodology and results.

It's modeling patterns found across the massive corpus of textual training input it has seen -- not the true concepts related by the words as humans understand them. If you don't believe me then ask ChatGPT some bespoke geometry-related brain teasers and see how far it gets.

I want to be clear that the successful scale-up of this training and inference methodology is nonetheless a massive achievement -- but it is inherently limited by the nature of its construction and is in no way indicative of a system that exudes agency or deliberative thought, nor one that "understands" or models the world as a human would.

> [...] no way indicative of a system that exudes agency or deliberative thought, nor one that "understands" or models the world as a human would.

Certainly not - its architecture doesn't model ours. But it has taken a huge step forward in our direction in terms of capabilities, from early to late 2022.

As its reasoning gets better, simply a conversation with itself could become a kind of deliberative thought.

Also, as more data modalities are combined (text with video and audio, human-generated content and recordings of the natural world, etc.) and math is included more systematically, its intuition for solving bespoke geometry problems, and other kinds of problems, is likely to improve.

Framing a problem is a lot of the solving of a problem. And we frame geometry with a sensory driven understanding of geometry that the current ChatGPT isn't being given.

>However it is still just pattern matching

The visual cortex in your brain is also "just a pattern matching" system. Guess it's not very impressive by your standard.

This[1] isn't my example (it's from another HN user), but if you work as a programmer and you're not absolutely jaw on the floor astonished by this example then I don't know what to say.

Explaining[2] the emergent behaviour is literally cutting edge research. Hand waving this behaviour away as just "probabilistically generating a likely next word" is ignorant.

It's amazing in similar ways to Conway's Game of Life.

[1] https://imgur.com/HOEnxYb

[2] https://ai.googleblog.com/2022/11/characterizing-emergent-ph...

I'm arguing against the notion that these LLMs exhibit "emergent behaviour" as you stated. I don't believe they do, as the term is commonly understood. Emergent behavior usually implies the exhibition of some kind of complexity from a fundamentally simple system. But these LLMs are not fundamentally simple, when considered together with the vast corpus of training data to which they are inextricably linked.

The emergent behavior of Conway's Game of Life arises purely out of the simple rules upon which the simulation proceeds -- a fundamental difference.
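
To make "simple rules" concrete, here is a short sketch of the whole Game of Life update in Python. The set-of-live-cells representation and the name `step` are my own choices for illustration; the birth and survival rule is the standard one.

```python
from collections import Counter

def step(live):
    # Count, for every cell, how many live neighbours it has.
    counts = Counter(
        (x + dx, y + dy)
        for (x, y) in live
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # A cell is alive next generation if it has exactly 3 live neighbours,
    # or if it is alive now and has exactly 2.
    return {cell for cell, n in counts.items()
            if n == 3 or (n == 2 and cell in live)}

glider = {(1, 0), (2, 1), (0, 2), (1, 2), (2, 2)}
state = glider
for _ in range(4):
    state = step(state)
print(sorted(state))
```

After four generations the glider has the same shape, shifted one cell diagonally. Nothing in `step` mentions gliders or movement; that behaviour comes entirely from the rule, with no training data involved.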

did you read the article?

emergent behavior in this context is defined as: "emergent abilities, which we define as abilities that are not present in small models but are present in larger models"

>The emergent behavior of Conway's Game of Life arises purely out of the simple rules upon which the simulation proceeds -- a fundamental difference.

this is a meaningless distinction.

> emergent behavior in this context is defined as: "emergent abilities, which we define as abilities that are not present in small models but are present in larger models"

Then I don't know why you brought up Game of Life because it obviously has nothing to do with this alternative definition of emergent behavior.

> this is a meaningless distinction.

It's meaningful with respect to the claim that LLMs exhibit emergent behavior in the same way in which Game of Life does.

>It's meaningful with respect to the claim that LLMs exhibit emergent behavior in the same way in which Game of Life does.

I said it's amazing in __similar__ ways to Conway's Game of Life.

i.e. a system which behaves in unexpected ways (emergent abilities) and is greater than the sum of its parts.

Apropos of [1]:

1. Item 3: The ocean is full of floating objects, and it would be hard to see the duck among them?
2. Item 2: is structured as non sequitur, takes a long time because there are many hazards?

I am impressed that you find it impressive. It is plausible-sounding, and I find that disturbing, but it is not useful (and the text prediction paradigm seems a dead end in terms of formulating anything more than plausible-sounding text).

> the visual cortex in your brain is also "just a pattern matching" system

I can think and a pattern matching system can't.

> you're not absolutely jaw on the floor

Not at all. Autocomplete will autocomplete.

What is thinking?
You are asking the right question.
Technically, it's not pattern matching. It's estimating conditional probabilities and sampling from them (and under the hood, building blocks like QKV attention, a.k.a. a probabilistic hashmap, together with the optimization used, decide what it does anyway, regardless of any theory behind it).
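
A rough sketch of both ideas in plain Python, in case it helps. The function names, the toy vocabulary, and the hand-picked logits are hypothetical; a real model computes the scores with learned weights over many layers and heads, which this does not attempt to show.

```python
import math
import random

def softmax(scores):
    # Turn arbitrary scores into probabilities that sum to 1.
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    # "Probabilistic hashmap": instead of returning the value of the one
    # matching key, return a weighted average of all values, weighted by
    # how well each key matches the query (scaled dot product + softmax).
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]

def sample_next(vocab, logits, rng):
    # Estimate P(next token | context) from the logits, then sample from it.
    probs = softmax(logits)
    return rng.choices(vocab, weights=probs, k=1)[0]

rng = random.Random(0)
vocab = ["duck", "ocean", "boat"]   # toy vocabulary (assumption)
logits = [2.0, 1.0, 0.1]            # hand-picked scores (assumption)
print(softmax(logits))
print(sample_next(vocab, logits, rng))
print(attention([10.0, 0.0], [[10.0, 0.0], [0.0, 10.0]], [[1.0, 0.0], [0.0, 1.0]]))
```

When one key matches the query far better than the others, the attention output is almost exactly that key's value, which is where the hashmap comparison comes from.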