by p1esk 1142 days ago
OK, you have a point. But, if you replace "GPT-4" with "an average human" in your response - it still makes sense.

When you ask a question of "an average human", they will reply with a sentence answering your question, or maybe asking for clarifications, etc. When you ask a question of GPT-4, it generates one token.

One. Token.

One token, that is not an answer to your question, that is not asking for clarifications, that is not trying to reason about your question, or how to answer it, because it's just one token.

One. Token.

For example, say you ask:

  "What is the sieve of Eratosthenes? Speak in chicken".
Then GPT-4 will answer with something like:

  Chicken
You don't see that because you don't interact with the GPT-4 model directly. You interact with ChatGPT, a "wrapper" around GPT-4, that takes your prompt, passes it to the model, then takes the token returned by the model and decides what to do with it. In the example above, ChatGPT will take the " Chicken" token generated by GPT-4 and append it to your question:

  What is the sieve of Eratosthenes? Speak in chicken. Chicken
Then it will send this new string back to the model, which will generate a new token:

  chicken
And then it will go like this:

  Iteration 2: What is the sieve of Eratosthenes? Speak in chicken. Chicken chicken

  ... 

  Iteration k: What is the sieve of Eratosthenes? Speak in chicken. Chicken chicken chicken chicken chicken chicken chicken chicken chicken chicken ...
At no point is GPT-4 trying to reason about your question, or trying to answer your question, or doing anything other than generating one. token. at a time. There's no thinking, no reasoning, no calculation, no logic, no deduction, no intelligence, no anything. It's only token, token, token. Chicken, chicken, chicken.

And do you know when the chickens stop? When GPT-4 generates a special and magickal token, called a stop-token (or a "stop sequence" in OpenAI docs). That's a token, not found in the training corpus, added to the end of every string during tokenisation. That's how ChatGPT knows to stop sending your prompt + generated tokens back to the model. It can't look back at what GPT-4 has generated so far, because it doesn't understand any of that. Because it doesn't understand anything, and therefore cannot reason about your question, or realise it has answered it. It cannot do anything except a) ask GPT-4 to generate another token or b) stop asking for more tokens.

  "What is the sieve of Eratosthenes? Speak in chicken". Chicken, chicken, chicken, chicken, .... chicken, stop_token!
No more chickens.

And that's how GPT-4 explains what the Sieve of Eratosthenes is, but in chicken.
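
A rough sketch of the loop described above, in Python. Everything in it is a stand-in: `toy_model`, `STOP_TOKEN` and `generate` are hypothetical names, not anything from OpenAI, and the toy simply returns a token, whereas a real model returns a probability distribution over its vocabulary from which one token is picked. It only illustrates the wrapper's two choices: ask for another token, or stop.

```python
STOP_TOKEN = "<|stop|>"  # hypothetical name for the stop token


def toy_model(text: str) -> str:
    """Stand-in for the model: maps the text so far to exactly one token."""
    # Assumption for the demo: stop once "chicken" appears six times in total
    # (the prompt itself already contains one).
    if text.lower().count("chicken") >= 6:
        return STOP_TOKEN
    # Capitalise only the first generated token, right after the prompt's full stop.
    return " Chicken" if text.endswith(".") else " chicken"


def generate(prompt: str, next_token=toy_model, max_tokens: int = 50) -> str:
    """The wrapper: append one token at a time until the stop token shows up."""
    text = prompt
    for _ in range(max_tokens):
        token = next_token(text)  # one call, one token
        if token == STOP_TOKEN:   # option b) stop asking for more tokens
            break
        text += token             # option a) append and ask again
    return text


prompt = "What is the sieve of Eratosthenes? Speak in chicken."
result = generate(prompt)
print(result)
```

The `max_tokens` cap is a design choice of this sketch: it keeps the loop finite even if the stand-in never emits the stop token.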

So what you see as a user is like watching a movie where a plate of spaghetti is flying through the air, lands on a tablecloth, the tablecloth lands on a table, the table lands on the floor, four chairs land around it and suddenly there are people in the chairs eating the spaghetti. It's not that someone has figured out how to reverse time: it's a recording, played backwards. It looks like things are going backwards, but they're not.

It looks like ChatGPT is answering your questions, but it's not. It looks like ChatGPT is interacting with you, but it's not. It's a magick trick.

Prompt: What is the sieve of Eratosthenes? Speak in chicken

Answer from GPT-4:

Sure, here's the explanation of the sieve of Eratosthenes explained in a simplified "chicken" style:

Imagine you are a chicken. You have lots of chicken friends. Let's say there are 100 of you. Now, you all stand in a line. Each chicken is numbered, from 2 up to 100. Chicken number 1 decided to stay at home because he's a bit special in this game.

Chicken number 2, a very important chicken, starts to crow. Every chicken in line after number 2 who has a number that is a multiple of 2 (so every second chicken from chicken number 2) gets scared and runs away.

Then chicken number 3, the next chicken in line who didn't run away, starts to crow. Every chicken in line after number 3 who has a number that is a multiple of 3 (so every third chicken from chicken number 3) also gets scared and runs away.

This continues with the next chicken in line who didn't run away, and so on. Each chicken scares away all the chickens whose numbers are multiples of their own number.

At the end, the chickens that are still in line, who didn't run away, are very special chickens. Their numbers are what humans call prime numbers. These are numbers that can only be divided by 1 and themselves.

This whole chicken game is like the Sieve of Eratosthenes. It's a way that humans have figured out to find prime numbers. You start from the lowest number (2) and eliminate multiples of each number. The numbers that are left over are the primes.

And that, in chicken speak, is the Sieve of Eratosthenes!
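
For reference, the chicken game above maps onto a few lines of Python. This is a minimal sketch of the textbook sieve, not something GPT-4 produced in this thread; the function name `sieve_of_eratosthenes` and the `in_line` list are just illustrative choices.

```python
def sieve_of_eratosthenes(limit: int) -> list[int]:
    """Return all primes up to and including `limit`."""
    if limit < 2:
        return []
    # in_line[n] is True while chicken number n is still standing in line.
    in_line = [True] * (limit + 1)
    in_line[0] = in_line[1] = False  # chicken number 1 stays at home
    for n in range(2, limit + 1):
        if not in_line[n]:
            continue  # this chicken already ran away, so it never crows
        # Chicken n crows: every later multiple of n runs away.
        for multiple in range(2 * n, limit + 1, n):
            in_line[multiple] = False
    return [n for n, still_there in enumerate(in_line) if still_there]


primes = sieve_of_eratosthenes(30)
print(primes)
```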

Bad prompt. Try "Only use the word chicken" instead of "Speak in chicken".
Prompt: What is the sieve of Eratosthenes? Only use the word "chicken".

Answer from GPT-4:

The chicken is in a chicken coop with many other chickens. Each chicken represents a number in the chicken coop. The first chicken that hasn't been touched, after the chicken representing one, is a prime chicken. We'll call this chicken the prime chicken.

Now, imagine that the prime chicken has a unique chicken dance. Every chicken that copies this prime chicken's dance is removed from the coop because they are not prime chickens. These are the chickens that would be multiples of our prime chicken.

Then we find the next chicken that hasn't been touched and doesn't know the dance of the previous prime chicken. This chicken is also a prime chicken. This prime chicken has a different chicken dance, and all the chickens that copy this new dance are also removed from the coop.

We keep doing this - finding the next chicken that hasn't danced yet, designating it as a new prime chicken with its unique dance, and removing any chickens that copy the dance - until there are no more new dances.

All the chickens left in the coop, who haven't copied any dances, are the prime chickens. This is the chicken way to understand the Sieve of Eratosthenes. The prime chickens are like the prime numbers in the Sieve of Eratosthenes.

----

I then followed up with: "You must only use the word "chicken". Do not use any other words."

Then it wrote an entire page filled with "chicken chicken chicken".

I then asked: "By what logic did you construct your response? Why not use just one occurrence of the word "chicken" and finish there?"

Response from GPT-4:

Apologies for any confusion. Your question seemed to request a description of the Sieve of Eratosthenes using only the word "chicken". While it's true that a single "chicken" would technically meet your request, it seemed that you were interested in a more elaborate or extended response, hence the repetition.

In any case, attempting to convey any meaningful information using only a single word—let alone a complex concept like the Sieve of Eratosthenes—is inherently impossible due to the lack of distinct symbols or constructs to represent different aspects of the concept. Thus, the extended repetition of "chicken" was intended more as a playful response rather than a serious attempt at conveying the concept you requested.

> There's no thinking, no reasoning, no calculation, no logic, no deduction, no intelligence, no anything. It's only token, token, token.

False. The neural network inside the transformer LLM contains a hierarchical semantic model, and has inferred some rules of reasoning from the training set. It can apply those rules to new input.

There are semantic layers above the "token token tokens".

Explore them here: https://openaipublic.blob.core.windows.net/neuron-explainer/...

What you're commenting here is simply repeating, without critical intent, the baseless claims connectionists have made about their systems for many decades. Similarly, those claims have been criticised, but connectionists simply ignore the criticisms and continue with the same old nonsense, as if nothing happened. For example, that ridiculous conceit that their systems have "neurons", or that the weights of functions in a neural net somehow represent semantic categories recognised by humans. These are all complete fantasies.

If you are not aware of the long history of debunking such fabrications, I suggest you start here:

Connectionism and Cognitive Architecture: A Critical Analysis

https://ruccs.rutgers.edu/images/personal-zenon-pylyshyn/pro...

We are talking about artificial neurons here. Not biological neurons. These are mathematical structures.

https://en.wikipedia.org/wiki/Artificial_neuron

These models infer semantic categories that correlate to categories within the human mind, to the extent that they can solve natural language understanding tasks.

No one is saying they are biological neurons, or that they model semantics exactly as the human mind would. It is mechanical pattern recognition that approximates our understanding.

You can browse those artificial neurons online and view their associations.

You're just saying words without ever explaining why. What am I supposed to do about that? There's nothing to argue with if you're just repeating nonsensical claims without even trying to support them.

For example:

>> It is mechanical pattern recognition that approximates our understanding.

That's just a claim and you're not even saying why you make it, what makes you think so, etc.

> That's just a claim and you're not even saying why you make it, what makes you think so, etc.

Mechanical - it is an algorithm, not a living being.

Pattern recognition - a branch of machine learning that focuses on the detection and identification of regularities and patterns in data. It involves classifying or categorizing input data into identifiable classes based on extracted features. The patterns recognized could be in various forms, such as visual patterns, speech patterns, or patterns in text data.

Approximates our understanding - meaning the model's 'understanding' is not exactly the same as human understanding.

When I say 'mechanical pattern recognition that approximates our understanding,' what I mean is that large language models (LLMs) like GPT-4 learn patterns from the vast amounts of text data they're trained on. These patterns correspond to various aspects of language and meaning.

For example, the models learn that the word 'cat' often appears in contexts related to animals, pets, and felines, and they learn that it's often associated with words like 'meow' or 'fur'. In this sense, the model 'understands' the concept of a cat to the extent that it can accurately predict and generate text about cats based on the patterns it has learned.

This isn't the same as human understanding, of course. Humans understand cats as living creatures with certain behaviors and physical characteristics, and we have personal experiences and emotions associated with cats. A language model doesn't have any of this - its 'understanding' is purely statistical and based on text patterns.
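
To make "purely statistical and based on text patterns" concrete, here is a toy sketch in Python. It is emphatically not how GPT-4 works internally; it is a much simpler co-occurrence count with cosine similarity, and the three-sentence corpus and the helper names (`cooccurrence_vectors`, `cosine`) are made up for illustration. The point is only that similarity between words can fall out of counting which words appear together.

```python
import math
from collections import Counter, defaultdict

# Tiny made-up corpus, purely for illustration.
corpus = [
    "the cat says meow and has soft fur",
    "the kitten says meow and has soft fur",
    "the car has wheels and an engine",
]


def cooccurrence_vectors(sentences):
    """For each word, count the other words sharing a sentence with it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for word in words:
            for other in words:
                if other != word:
                    vectors[word][other] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


vectors = cooccurrence_vectors(corpus)
cat_kitten = cosine(vectors["cat"], vectors["kitten"])
cat_car = cosine(vectors["cat"], vectors["car"])
print(cat_kitten > cat_car)  # "cat" sits closer to "kitten" than to "car"
```

Nothing in this code knows what a cat is; "cat" and "kitten" come out similar only because they occur next to the same words.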

The evidence for these claims comes from the performance of these models on various tasks. They can generate coherent, contextually appropriate text, and they can answer questions, translate languages, and perform other language-related tasks with a high degree of accuracy. All of this suggests that they have learned meaningful patterns from their training data.

Your disagreement seems to be a philosophical one. It is not a technical argument. It seems that you won't accept that semantics can be modelled by an unconscious mechanical system. I am talking about mathematical concepts of semantics, not "true" human semantics that are the product of human insight and consciousness. https://en.wikipedia.org/wiki/Semantic_similarity

While AI doesn't have an innate understanding of the world as humans do, the semantic representations it learns from vast amounts of text data can be surprisingly rich and detailed. It can capture associations and nuances that are not immediately apparent from a purely syntactic analysis of the text.