by throw310822 115 days ago

> just find the most probable word that follows next

Well, if in all situations you can predict which word Einstein would probably say next, then I think you're in a good spot.

This "most probable" stuff is just absurd handwaving. Every prompt of even a few words is unique; there simply is no trivially "most probable" continuation. Probable given what? What these machines learn to do is predict what intelligence would do, which is the same as being intelligent.

qsera 115 days ago

>Probable given what?

The training data..

>predicting what intelligence would do

No, it just predicts what the next word would be if an intelligent entity translated its thoughts into words, because it is trained on text written by intelligent entities.

If it was trained on text written by someone who loves to rhyme, you would be getting all rhyming responses.

It imitates the behavior -- in text -- of whatever entity generated the training data. Here the training data was made by intelligent humans, so we get an imitation of the same.

It is a clever party trick that works often enough.

empath75 115 days ago

It is impossible to accurately imitate the action of intelligent beings without being intelligent. To believe otherwise is to believe that intelligence is a vacuous property.

slopinthebag 115 days ago

An unintelligent device can accurately imitate the action of intelligent beings within a given scope, in the same way an actor can accurately imitate the action of a fictional character in a given scope (the stage or camera) without actually being that character.

If the idea is that something cannot accurately replicate the entirety of intelligence without being intelligent itself, then perhaps. But that isn't really what people talk about with LLMs given their obvious limitations.

xlii 114 days ago

So the actors who portray great thinkers are great thinkers?

bonoboTP 114 days ago

No, actors recite a pre-written script. But scriptwriters do have to be great thinkers in order to know what the great thinker would actually say.

kqr 114 days ago

I suppose they really only have to be good at knowing what sort of thing the audience would believe a great thinker would say. As long as the audience does not consist of great thinkers they also cannot know for sure what a great thinker would say.

bonoboTP 114 days ago

That's true for unverifiable "talk professions" where there is no grounding and it's all self-referential navel-gazing chatter.

But LLMs are already beyond that in writing code that passes actual tests, proving theorems that are checkable with formal methods, etc.

The people who still say LLMs are just parrots in 2026 will just keep saying this no matter what, so I don't think it makes sense to argue this point further.

jeremyjh 114 days ago

Which is why so many portrayals are unconvincing.

qsera 115 days ago

>It is impossible to accurately imitate the action of intelligent beings without being intelligent.

Wait, what? So a robot that is accurately copying the actions of an intelligent human is intelligent?

empath75 115 days ago

That was probably phrased poorly. If a robot can independently and accurately do what an intelligent person would do when placed in a novel situation, then yes, I would say it is intelligent.

If it's just basically being a puppet, then no. You tell me what Claude Code is more like: a puppet, or a person?

qsera 114 days ago

It is neither a puppet nor a person. It is a computer program.

throw310822 114 days ago

As much as a bundle of an mp3 decoder and a terabyte of mp3 music are "just a program".

UltraSane 115 days ago

How can you distinguish intelligence from a sufficiently accurate imitation of intelligence?

slopinthebag 115 days ago

By "sufficiently accurate" do you mean identical? Because if so, it's not an imitation of intelligence at all, and the question is thus nonsensical.

UltraSane 115 days ago

"it's not an imitation of intelligence at all"

But that is the key question: how can you tell when an imitation of intelligence becomes the real thing?

throw310822 115 days ago

> The training data

If the prompt is unique, it is not in the training data. True for basically every prompt. So how is this probability calculated?

cbovis 115 days ago

The prompt is unique but the tokens aren't.

Type "owejdpowejdojweodmwepiodnoiwendoinw welidn owindoiwendo nwoeidnweoind oiwnedoin" into ChatGPT and the response is "The text you sent appears to be random or corrupted and doesn’t form a clear question." because the prompt doesn't correlate to training data.

newswasboring 114 days ago

> The prompt is unique but the tokens aren't.

The tokens aren't unique, but the sequence is. Every input this model sees is unique. Even tokens are not as simple as they seem.

If you type "ejst os th xspitsl of fermaby?" in ChatGPT it responds with

> It looks like you typed “ejst os th xspitsl of fermaby?”, which seems like a garbled version of:

> “What is the capital of Germany?”

> The capital of Germany is Berlin.

> If you meant to ask something else, feel free to clarify!

edit: formatting

ajam1507 113 days ago

The prompt does correlate to its training data. In this case, since you sent random text, it generated the most likely response to random text.

HDThoreaun 114 days ago

Or because the text you sent was random and doesn't form a clear question?

hmmmmmmmmmmmmmm 115 days ago

...? what is the response supposed to be here?

hmmmmmmmmmmmmmm 115 days ago

Hamiltonian paths and previous work by Donald Knuth are more than likely in the training data.

red75prime 115 days ago

The specific sequence of tokens that comprises Knuth's problem together with an answer to it is not in the training data. A naive probability distribution based on counting token sequences that are present in the training data would assign 0 probability to it. The trained network represents an extremely non-naive approach to estimating the ground-truth distribution (the distribution that corresponds to what a human brain might have produced).
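
To make the "naive counting" point concrete, here is a rough sketch, assuming a toy two-sentence corpus and whitespace tokenization. The function names `count_continuations` and `naive_next_token_probability` are made up for illustration and are not how any real LLM works; this is exactly the approach a trained network does not take.

```python
from collections import Counter

def count_continuations(corpus, context):
    """Count which token follows each verbatim occurrence of `context`."""
    n = len(context)
    counts = Counter()
    for tokens in corpus:
        # Slide over every position where a full context plus one token fits.
        for i in range(len(tokens) - n):
            if tokens[i:i + n] == context:
                counts[tokens[i + n]] += 1
    return counts

def naive_next_token_probability(corpus, context, token):
    """Relative frequency of `token` after `context`, by pure counting."""
    counts = count_continuations(corpus, context)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # the context never appears verbatim in the corpus
    return counts[token] / total

# Toy "training data" (an assumption for illustration only).
corpus = [
    "the cat sat on the mat".split(),
    "the dog sat on the rug".split(),
]

# A short context that occurs twice, with two different continuations.
p_short = naive_next_token_probability(corpus, ["sat", "on", "the"], "mat")

# Every token here is known, but "the cat sat on the rug" never occurs.
p_new_ending = naive_next_token_probability(
    corpus, "the cat sat on the".split(), "rug"
)

# A context that never occurs verbatim at all.
p_new_context = naive_next_token_probability(
    corpus, "the bird sat on the".split(), "mat"
)

print(p_short, p_new_ending, p_new_context)
```

The longer and more specific the context, the more likely the counts are empty, so a pure counting model has nothing to say about almost any real prompt even when every individual token is familiar.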

qsera 114 days ago

>the distribution that corresponds to what a human brain might have produced..

But the human brain (or any other intelligent brain) does not work by generating a probability distribution of the next word. Even beings that do not have a language can think and act intelligently.

hmmmmmmmmmmmmmm 114 days ago

You are always making predictions based on the context. That's why illusions can be so effective like these ones: https://illusionoftheyear.com/cat/top-10-finalists/2024/

astrange 114 days ago

LLMs also don't work by generating probability distributions of the next word. Your explanation isn't able to explain why they can generate words, let alone sentences.

red75prime 114 days ago

[Citation needed] Neuroscience isn't yet at a point where it can say this with any certainty.

Anyway, it's not a theorem that you can be intelligent only if you fully imitate biological processes, just as flight can be achieved by more than flapping wings.

hmmmmmmmmmmmmmm 113 days ago

Obviously there is some level of memorisation involved. That's why you can even get LLMs to write parts of Harry Potter from scratch with perfect precision.

qsera 115 days ago

Just using a scaled up and cleverly tweaked version of linear regression analysis...

red75prime 115 days ago

That is, the probability distribution that the network should learn is defined by which probability distribution the network has learned. Brilliant!