


	by libraryofbabel 113 days ago
	I have come to think “predict the next token” is not a useful way to explain how LLMs work to people unfamiliar with LLM training and internals. It’s technically correct, but at this point saying that and not talking about things like RLVR training and mechanistic interpretability is about as useful as framing talking with a person as “engaging with a human brain generating tokens” and ignoring psychology. At least AI-haters don’t seem to be talking about “stochastic parrots” quite so much now. Maybe they finally got the memo.

7 comments

qsera 113 days ago

> “predict the next token” is not a useful way

That is the exact thing to say because that is exactly what it does, regardless of how it does so.

It is not useful to say it if you are an AI-shill though. You brought up AI-haters, so I think I am entitled to bring up AI-shills.

vasco 113 days ago

My neurons are also just passing electric signals back and forward and exchanging water and salts with the rest of my body.

qsera 113 days ago

> just passing electric signals back and forward

Ok, feel free to call yourself a toaster, I don't mind!

vasco 113 days ago

What, reductionism only works when you do it?

qsera 113 days ago

I didn't

stephenr 113 days ago

I mean, that's really just a comparison to how silicon circuits work though, isn't it?

"Thinking rocks" vs "thinking meat sacks" isn't much of a distinction really.

Conversely, if you approach conversations the same way an LLM does and just repeat what you've heard other people say a lot without actually knowing what it means, then you're also likely to be compared to a feathery chatterbox.

dylan604 113 days ago

I think talking to people unfamiliar with LLM training using words like "RLVR training and mechanistic interpretability" is about as useful as a grave robber in a crematorium.

libraryofbabel 113 days ago

Obviously you don’t just say those words and leave it at that. Both those things can be explained in understandable terms. And even having a superficial sense of what they are gives people a better picture of what modern LLMs are all about than tired tropes from three years ago like “they’re just trained to predict the next token in the training data, therefore…”

goatlover 113 days ago

Must one be an "AI-hater" to use the term "stochastic parrot"? The term is probably a response to all the emergent-AGI claims and pointless discussions about LLMs being conscious.

measurablefunc 113 days ago

Sampling over a probability distribution is not as catchy as "stochastic parrot", but I have personally stopped telling believers that their imagined event horizon of transistor scale is not going to deliver them to their wished-for automated utopia, because one cannot reason with people who did not reach their conclusions by reasoning.

stephenr 113 days ago

> stochastic parrots

I prefer to use the term "spicy autocomplete" myself.

imiric 113 days ago

Technical concepts can be broken down into ideas anyone can understand if they're interested. Token prediction is at the core of what these tools do, and is a good starting point for more complex topics.

On the other hand, calling these tools "intelligent", capable of "reasoning" and "thought", is not only more confusing and impossible to simplify, but also dishonest and borderline gaslighting.

Alex_L_Wood 113 days ago

“Stochastic parrots” only stopped because AI fanboys stopped screaming “AGI” and “it will replace everyone”. Maybe they finally got the memo?
