


	by totorovirus 844 days ago
	Proves my point that LLMs are simply next-token predictors. There are many interesting properties in which we see the "emergence" of intelligence, but I think it's just humans' inability to hold so much knowledge in active memory.

4 comments

cornel_io 844 days ago

"Next token predictor" isn't quite the burn that it seems like, because perfect next token prediction would require actual understanding. That's because you can almost always cast any question about understanding into a form where it depends solely on the next token (there are a couple nitpicky exceptions and caveats but not many).

GPT-4 is at a high enough level of performance that mere simple statistics aren't really helping it do any better; it really is developing structures, especially in the middle layers, that perform some amount of high-level understanding.

I don't think that pure next token prediction will always be the optimal way to train and enhance these behaviors, but it's not fair to say that it's unrelated. If this really were just stochastic parroting, then LLMs would have topped out way before the level they're at now.

SunlitCat 844 days ago

That's the thing. Although the source of their knowledge is pure condensed wisdom, which makes them some sort of artificial intelligence, they lack the ability to "think", which is crucial for solving problems.

namaria 844 days ago

Mapping language patterns in vector space is most definitely not "pure condensed wisdom".

SunlitCat 844 days ago

Thank you for clarifying this. My comment was more about them showing signs of intelligence. Maybe I oversimplified my statement.

potatoman22 844 days ago

LLMs literally are next token predictors, so I'm not understanding your broader point.

deadbabe 844 days ago

I think this has always been pretty obvious, but the AI faithful have a vested interest in insisting that LLMs can actually think and solve problems.

freejazz 844 days ago

More shocking are those who insist that the human brain must then also work by just guessing the next missing thing. As if the thought process behind "I'm hungry" starts with "I" and then tries to figure out what best fits next... it's absurd.

ben_w 844 days ago

The token would be the pure sensation of hunger, not the word for self, which is merely a convenient abstraction that we use to share knowledge with the outside world and over time.

LLMs don't have that sensation (why would they?), but that doesn't mean they can only be used for text: https://deepgram.com/learn/applications-of-transformer-model...

freejazz 844 days ago

Jeez, I don't know how you would think that I thought that LLMs would have a sensation of hunger. That was not even close to my point at all.

ben_w 844 days ago

> As if the thought process behind I'm hungry starts with "I"

That sounds like {you think that {people who think LLMs work like humans} believe that {the human sensation of hunger} is merely {saying the phrase "I am hungry"}}.
