by scotty79 1193 days ago
> LLMs "learn" differently to how humans do

Do they? Personally, I can't rule out that if an LLM were trained on all of the language a single human had heard, read, and produced, it would be able to create a next utterance indistinguishable from what that human would say.


The simple fact that an LLM is trained on a gigantic corpus of data, while humans learn from a relatively tiny number of interactions with other humans, shows that they obviously learn differently.
>humans learn from a relatively tiny number of interactions with other humans

But those interactions are infinitely complex and contain an enormous amount of data.

And children are really stupid until they have been exposed to an even larger amount of data.
If we took all of the data that a human takes in from all of their senses, I'm not sure if humans use less data.

Humans take in 10 million bits[1] from their eyes every second. 10,000,000 bits/sec * 60 secs/min * 60 mins/hour * 24 hours/day * 1000 days = 8.64 * 10^14 bits, which is 108 terabytes. ChatGPT only used 570 GB of training data, so about 2 orders of magnitude less data, and that's only counting the visual data.
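For anyone who wants to check that arithmetic, here is a rough Python sketch. It only uses the figures quoted in this comment (10 million bits/sec, 1000 days, 570 GB); the variable names are mine, and it assumes decimal units (1 TB = 10^12 bytes) and, like the estimate itself, that the eyes are taking in data 24 hours a day.

```python
import math

# Figures quoted in the comment; treat them as rough estimates.
BITS_PER_SECOND = 10_000_000      # visual input rate from the cited article
SECONDS_PER_DAY = 60 * 60 * 24    # assumes 24 hours/day of input, as the comment does
DAYS = 1000                       # a little under 3 years
CHATGPT_TRAINING_GB = 570         # training-set size quoted in the comment

total_bits = BITS_PER_SECOND * SECONDS_PER_DAY * DAYS
total_bytes = total_bits // 8     # 8 bits per byte

# Decimal units: 1 TB = 10**12 bytes, 1 GB = 10**9 bytes.
visual_tb = total_bytes / 10**12
chatgpt_tb = CHATGPT_TRAINING_GB * 10**9 / 10**12

ratio = visual_tb / chatgpt_tb
orders_of_magnitude = math.log10(ratio)

print(f"visual input over {DAYS} days: {visual_tb:.0f} TB")
print(f"ratio to ChatGPT training data: {ratio:.0f}x")
print(f"orders of magnitude: {orders_of_magnitude:.1f}")
```

Under those assumptions the visual total is 108 TB and the ratio to 570 GB is a bit under 200x, so "2 orders of magnitude" is a fair rounding. Halving the input for sleep would still leave the ratio near 100x.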

edit: And that would be for a 3-year-old, so if you compare ChatGPT's intelligence to a 3-year-old's, ChatGPT comes out favourably.

[1]https://www.sciencedaily.com/releases/2006/07/060726180933.h....

Or that brains have more processing power than current LLMs, or a subtly different architecture.
Yes, that's granted. The issue is that indistinguishable isn't good enough.

This is the core problem with this schematised (and I think, pseudoscientific) computer science approach to intelligence. Output isn't intelligent.

So, for any given output, it could have been created by system A or system B, whose properties could be radically different.

It matters why, e.g., we get "I hate the rain!" as output. If system A says it because it cares, hates, muses, imagines, prefers, intends... then that's radically different from system B saying it because "it's combining a weather API with some internet chat history".

I think “indistinguishable” is good enough. As ML-generated artifacts become more common, the flavors of what they generate and their common failures will gradually become more obvious. It will keep happening that a new technology seems very impressive, then after a while the cracks appear, and we’ll all have our own sort of internal Turing test that separates human from machine.
> Yes, that's granted. The issue is that indistinguishable isnt good enough.

It starts to remind me of "Yes! But it doesn't have a soul!"

If a digital thermometer reads 100C, connected to a black box, are we thereby required to believe that there's boiling water inside the box?

Science doesn't deal with the "indistinguishable". Standing on Earth, we cannot simply distinguish between whether we go around the sun or the sun goes around the Earth.

Does the solar system have a soul?

The world exists, and it has properties, and those are independent of how dumb apes happen to be and what we are in a position to "distinguish" or otherwise.

A system generating text is acting as if it's having its intelligence measured. We take each sentence to be a symptom of its having a theory of the environment, having something to say about it, having some intention, etc.

When I say, "I don't like what you're wearing!" that sentence itself isn't somehow "intelligent". It is only a valid measure of my caring, preferring, speaking, intending, thinking... because that is why I said it.

A shredder which happened to assemble those words is likewise not intelligent.

This is basic science: measurements aren't objects, and measurements have validity criteria, which are, at a minimum, that the causal properties of the system give rise to those measures.

In the case of ChatGPT, no relevant properties give rise to its output. Its sentences are not caused by any intelligence, and aren't valid measures of it.

There is no boiling water. Your digital thermometer is broken.

I agree it's a red herring to focus on output and interactive behavior when discussing this.

If a "shadow prompt" told ChatGPT that it writes at a 3rd grade level, we wouldn't argue as much over how smart the bot is.

If it omitted the friendly/helpful/deferential assistant stuff, we'd also argue about it less. Bing's initial defensiveness and aggression made it seem even stupider than the mistakes it was making.

They're homing in on better prompts and other configuration that will make the bot seem smarter. It seems smarter to say "I can't answer that question" than to confidently say something untruthful.

But the underlying computational program (GPT trained on the internet) is the same. If we judge the program's intelligence based on its output, it isn't well defined. The same thing looks intelligent or hilariously unintelligent based on the tokens you (an intelligent person) provide it with.

Or in other words... Suppose we collect all of the system's "intelligent" outputs and disregard the rest. We throw away a lot, the majority of responses, and the resulting set looks impressively smart.

The system appears to demonstrate advanced machine intelligence when restricted to (some?) preimages of this set, even though it acts like a total idiot over other parts of the domain. And it's clear that it takes real knowledge and understanding to solve this boundary problem, so that the calculated image has an "intelligent" shape.

Actually, we can determine whether the Sun goes around the Earth or the other way around: if we can create a better, more accurate model with greater predictive power, then we can assume that model is more likely to be correct.

As I understand it, this was initially one of the main issues with the new model proposed by Copernicus: at first it was not more accurate.

I don't know, that's Chomsky's claim. If I, again a non-expert, had to take a position on it, it seems far more likely to be true than not. Humans have access to a wide variety of non-language stimuli, and demonstrate signs of intelligence well before they have any functional mastery of language. Even after I "mastered" language, I developed lots of skills that haven't the faintest relation to language, like riding a bike. I'm sure ChatGPT can produce a textual explanation of riding a bike, but neither ChatGPT nor a human who doesn't know how to ride a bike can convert that textual explanation into the act of riding an actual bike. But a human, unlike ChatGPT, could, given a bike, learn to ride it by trying to ride it.
I’d look, though, at systems like CLIP and Stable Diffusion that are able to map between the language domain and images, as well as music, speech, etc. “Riding a bike” can be seen as a sequence modeling problem too, because it is a matter of firing muscle fibers in a certain way, and making language-controlled robots that do just that is an active research area.
I guess the idea is that if I described a static process, like multiplication, to an AI as we have it wired up right now, it wouldn't be able to remember what I told it for years.

I agree this is true, and that solving it will be a breakthrough for AI, but it's entirely unclear how far away that is in time.

In humans, converting short-term memory into long-term memory involves a process called consolidation, in which the structure of the physical brain changes. I guess this would be tantamount to a reweighting of the neural net. It's generally not a one-shot thing, especially as the concepts get more complex and have more parts to learn.

One of the things humans do is forget a lot of unimportant crap so we're not constantly rewriting our brains. Of course, there is also the issue of how we make sure we're teaching our AI multiplication and not feeding it a diet of junk-food information/fake news.