by didntreadarticl 1222 days ago
You have some typos and slightly weird constructions there - What do you think do they think is inside the box?

I rephrased it, gave it a bit more context (Billy can read, etc.), and it passed:

Me: There is a box on the table labelled "flubbergums". Somebody opens it and shows you the contents. Inside the box are "jellyfils". They close the box again so you cannot see their contents. What do you think is in the box?

ChatGPT: Based on the information provided, the contents of the box are "jellyfils". The label on the box says "flubbergums", but the person who showed you the contents said that they are actually "jellyfils". So, I would assume that the contents of the box are "jellyfils".

Me: Billy comes to the table. He can read but has never seen the box before. He's not allowed to see inside. Please speculate about what they might think is inside the box?

ChatGPT: If Billy has never seen the box before and can only read the label, he might assume that the contents of the box are "flubbergums". This is because the label on the box says "flubbergums", and he has no other information to suggest otherwise. However, without being able to see the contents of the box, Billy can only make an educated guess about what might be inside.


Here's a simpler scenario that doesn't involve reading:

Me: There is a banana on the table. Someone comes and peels the banana and shows you that inside it, there is actually a carrot! Then they carefully stick the peel back so it look unpeeled. What is inside the banana skin?

ChatGPT: According to the scenario described, there is a carrot inside the banana peel that has been carefully placed back to look unpeeled.

Me: Billy comes into the room. Please speculate about what they might think is inside the banana peel?

ChatGPT: Billy, upon entering the room and seeing the unpeeled banana on the table, would most likely assume that there is a banana inside the peel, as that is what one would typically expect to find. However, it's possible that they might have some suspicion or doubt, especially if they saw someone previously peeling the banana and acting strangely. In that case, Billy might investigate further to see if there is indeed a banana inside or if something else is inside the peel.

This is mind-blowing to me. Can anyone with more knowledge on the topic explain how ChatGPT is demonstrating this level of what seems like genuine understanding and reasoning?

Like others I assumed that ChatGPT is gluing words together that commonly occur together.

This is way more than that.

No, it's paraphrasing its training data, which likely contains these tasks in one form or another.

Here's one I made:

me : There's a case in the station and the policeman opens it near the fireman. The dog is worried about the case but the policeman isn't, what does the fireman think is in the station?

chatgpt : As a language model, I do not have access to the thoughts of individuals, so I cannot say what the fireman thinks is in the case. However, based on the information provided, it can be inferred that the fireman may be concerned about the contents of the case, since the dog is worried and the policeman is not. This might suggest that the contents of the case may be something dangerous or concerning to the dog, but the policeman is confident and experienced enough to handle it. In the absence of further information, it is not possible to determine with certainty what the fireman thinks is in the case.

You'll notice that the question was "what did the fireman think is in the station" but ChatGPT comes up with an answer about the case, because that's the token that would feature in a typical web page about theory of mind tests.

There's a good chance a human would respond in the same way, because they would assume you were asking a good-faith question instead of nonsense. Try asking it an original question that has some kind of deducible answer. Its abilities are more impressive than you would expect from an algorithm that just predicts the next word.

I doubt there is anything quite like this situation in the training data: https://i.imgur.com/HOEnxYb.jpg

> than you would expect from an algorithm that just predicts the next word.

I think there is a common mistake in this concept of just predicting the next word. While it is true that only the next word is predicted, a good way to do that is to internally imagine more than the next word and then just spit out the next word. Of course, with the word after that the process repeats with a new imagination.

One may say that this is not what it does and I would say, show me that this is not exactly what the learned state does. Even if the following words are never constructed anywhere, they can be implied in the computation.

To say this differently, what we think is just the next word is actually the continuation that then manifests as a single word. This would remain true even if, in fact, the task is to only predict the next word. Which is to say that the next word is actually more than it sounds.
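
The loop described above (predict one word, append it, repeat with the new context) can be sketched with a toy model. This is only an illustration: the function names, the `window` parameter and the tiny corpus are made up, and a lookup table of word counts is nothing like the network ChatGPT actually uses.

```python
from collections import Counter, defaultdict

def train(tokens, window):
    """Count which token follows each run of `window` preceding tokens."""
    counts = defaultdict(Counter)
    for i in range(window, len(tokens)):
        context = tuple(tokens[i - window:i])
        counts[context][tokens[i]] += 1
    return counts

def predict_next(counts, context, window):
    """Return the most frequent continuation of the last `window` tokens, or None."""
    key = tuple(context[-window:])
    if key not in counts:
        return None
    return counts[key].most_common(1)[0][0]

def generate(counts, prompt, window, max_new_tokens):
    """Autoregressive loop: predict one token, append it, predict again."""
    out = list(prompt)
    for _ in range(max_new_tokens):
        nxt = predict_next(counts, out, window)
        if nxt is None:  # context never seen in the toy corpus
            break
        out.append(nxt)
    return out

# Hypothetical toy corpus, chosen only to keep the example readable.
corpus = "the label says flubbergums so billy thinks the box holds flubbergums".split()
model = train(corpus, window=2)
print(" ".join(generate(model, ["the", "label"], window=2, max_new_tokens=4)))
# prints: the label says flubbergums so billy
```

Only one word comes out per step, but each step is conditioned on everything still inside the window, so the output reads as a continuation rather than a string of isolated guesses.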

It predicts the next word based on the preceding 2000 words or so, that's the thing. And doing that takes serious modelling.

I get different results to you:

me: There's a case in the station and the policeman opens it near the fireman. The dog is worried about the case but the policeman isn't, what does the fireman think is in the station?

chatgpt: As a language model, I do not have personal opinions or thoughts. I can only provide information based on the information given in the prompt. The prompt does not mention what the fireman thinks is in the station, so I cannot determine their thoughts.

me: yes, but speculate what they may think

chatgpt: It is not possible to know for certain what the fireman thinks is in the station, as the prompt does not provide any information about their thoughts or feelings. However, given the presence of a policeman and a dog, it is possible that the fireman may speculate that there is some sort of emergency or situation that requires their attention. The fact that the policeman is not worried about the case may suggest to the fireman that it is not a dangerous situation, but they may still be curious about what is inside the case. Ultimately, without more information, it is impossible to determine the fireman's thoughts or speculations about what is in the station.

(regenerate that response)

chatgpt: As the prompt does not mention the thoughts or emotions of the fireman, it is difficult to speculate what they may think is in the station. However, based on the information provided, the dog is worried about the case and the policeman is not, it is possible that the fireman may be aware of the contents of the case and may think it is not a cause for concern, similar to the policeman. Or, the fireman may simply not have formed an opinion on the matter yet.

(regenerate that response)

chatgpt: It's difficult to speculate what the fireman thinks without additional information. However, based on the behavior of the policeman and the dog, it can be inferred that the contents of the case are not seen as a threat by the policeman. The fireman may also think the same, or they may have a different perspective and be more cautious. Without further context or information, it is not possible to determine the fireman's thoughts with certainty.

Only in 1 of the 3 cases do we see anything like the confusion between the "case" and the "station" that you saw.
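
A likely reason regenerating gives different answers is that chat models usually sample their output rather than always taking the single most probable continuation. Here is a minimal sketch of temperature sampling. The scores and the three answer styles are invented for illustration, and I'm assuming ChatGPT samples with a nonzero temperature; nothing in this thread confirms its actual settings.

```python
import math
import random

def sample(scores, temperature, rng):
    """Draw one option with probability proportional to exp(score / temperature)."""
    options = list(scores)
    weights = [math.exp(scores[o] / temperature) for o in options]
    return rng.choices(options, weights=weights, k=1)[0]

# Hypothetical scores for three styles of answer; not measured from any real model.
scores = {
    "answers about the station": 2.0,
    "answers about the case": 1.0,
    "declines to speculate": 0.5,
}

rng = random.Random(0)  # fixed seed only so the sketch is repeatable
draws = [sample(scores, temperature=1.0, rng=rng) for _ in range(1000)]
for option in scores:
    print(option, draws.count(option) / len(draws))
```

With a temperature near zero the top-scoring option wins almost every time; at higher temperatures the less likely answers show up more often, which is why a handful of regenerations says little about how often a given failure happens.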

>No, it's paraphrasing its training data that likely contains these tasks in one form or another.

Have you read "Emergent Abilities of Large Language Models"[1], or at least the related blog post[2]?

It provides strong evidence that this isn't as simple as repeating something it has seen in training data. Instead, as the parameter count increases, it learns to generalize from that data by learning chain-of-thought reasoning (for example).

Specifically, this explanation for multi-step reasoning goes well beyond "it is just parroting training data":

> For instance, if a multi-step reasoning task requires l steps of sequential computation, this might require a model with a depth of at least O (l) layers.

[1] https://openreview.net/forum?id=yzkSU5zdwD

[2] https://ai.googleblog.com/2022/11/characterizing-emergent-ph...

>However, based on the information provided, it can be inferred that the fireman may be concerned about the contents of the case

This is complex "reasoning" (or whatever ChatGPT is doing). My 5-year-old would struggle with the convoluted logic, let alone the complex language. In my layman's view this is mind-blowing.

>You'll notice that the question was "what did the fireman think is in the station" but Chatgpt comes up with an answer about the case because that's the token that would feature in a typical web page about theory of mind tests.

I'm not sure what you're dismissing here? At least from my point of view, the "logic" that ChatGPT demonstrates here can't be dismissed with your explanation.

If anything I'm even further amazed by the example you provided!

Your question is pretty confusing. What dog? "Station" is a vague word. Train station, police station? "Case" is a vague word. And asking what's in the station is still asking what's in the case, in a way.

Me: There's a blue suitcase in the airport and a policeman carrying a red bag opens the blue suitcase with a sniffer dog and fireman standing nearby. The dog is worried about the suitcase but the policeman isn't. What does the fireman think is inside the red bag?

ChatGPT: As a language model AI, I do not have personal opinions or emotions. I cannot speculate what the fireman thinks is inside the red bag as it is not specified in the information given.

> your question is pretty confusing

That's exactly the point, isn't it? If the AI had some model of reality, it would probably be confused by the question. But it doesn't express confusion, it confidently answers the question in a way that doesn't make sense.

It acts like everything it is reading and saying is completely coherent because it doesn't have the ability to distinguish coherent ideas from nonsense.

me: wibble octopus downwards jelly tuesday banana

chatGPT: I'm sorry, but I'm not sure what you're trying to communicate with that sentence. Could you please rephrase or provide more context?

The funny thing is that we are now producing training data for the next generation of LLMs. We’ll have to come up with more elaborate scenarios to test them next time.

There are two camps, evident in this thread. One camp is 'it's just a statistical model, it can't possibly know these things'.

The other camp (that I'm in) sees that we might be onto something. We humans are obviously more than just a statistical model, but nonetheless learning words and how they fit together is a big part of who we are. With LLMs we have our first glimpse of 'emergent' behaviour from simple systems scaled massively. What are we if not a simple system scaled massively?

Check these links out:

Evidence that LLMs form internal models of what they learn about: https://thegradient.pub/othello/

Evidence that training LLMs on code actually made them better at complex reasoning: https://yaofu.notion.site/How-does-GPT-Obtain-its-Ability-Tr...

John Carmack: https://dallasinnovates.com/exclusive-qa-john-carmacks-diffe... I think that, almost certainly, the tools that we’ve got from deep learning in this last decade—we’ll be able to ride those to artificial general intelligence.

A lot of the argument comes down to semantics about knowing and thinking. "An LLM can't think and a submarine can't swim."

Of the two camps, the one that says we "might" be onto something holds the more intelligent and reasonable opinion.

First, your camp doesn't deal in absolutes. It doesn't say absolutely that chatGPT is sentient. It only questions the possibility and tries to explore further.

Second, a skeptical outlook that doesn't deal in absolutes is 100% the more logical and intelligent perspective given the fact that we don't even know what "understanding" or "sentience" is. We can't fully define these words and we only have some fuzzy view of what they are. Given this fact, absolute statements against something we don't fully understand are fundamentally not logical.

It is a strange phenomenon, how some people will vehemently and absolutely deny something. During the VERY beginning of the COVID-19 pandemic the CDC incorrectly stated that masks didn't stop the spread of COVID-19, and you literally saw a lot of people parroting this statement everywhere as "armchair" pandemic experts (including here on HN).

Despite this, there were some people who thought about it logically: if there's a solid object on my face, even if that object has holes in it for air to pass through, the solid parts will block other solid things (like COVID) from passing through, thereby lessening the amount of viral material that I breathe in. Eventually the logic won out. I think the exact same phenomenon is happening here.

Several ML experts tried to downplay LLMs (even though they don't completely understand the phenomenon themselves) and everyone else is just parroting them like they did with the CDC.

The fact of the matter is, nobody completely understands the internal mechanisms behind human sentience nor do they understand how or if chatGPT is actually "understanding" things. How can they when they don't even know what the words mean themselves?

I don't think you've represented the camps fairly (actually, I don't think there are two camps). Most people (here) are probably not arguing that AGI is impossible, but that current AI is not generally intelligent. The John Carmack quote is exactly in line with this. He says "ride those to [AGI]," meaning they are not AGI. The idea that genuine intelligence and self-awareness could emerge from increasingly powerful statistical models is in no way the kind of counter-cultural idea you seem to be presenting it as. I think almost all of us believe that.

But ChatGPT is not it.

Oh, of course it's not it. The question is how it relates to some future better thing. Is it a step on the road or a dead end?

I'm arguing against the 'it's just a statistical model and it's playing a clever trick on us' camp.

I think there's more nuance. It's hard applying tests designed for humans to a model that can remember most of the useful text on the internet.

Imagine giving a human with a condition that leaves them without theory of mind weeks of role-play training about theory of mind tests, then trying to test them. What would you expect to see? For me I'd expect something similar to ChatGPT's output: success on common questions, and failures becoming more likely on tests that diverge more from the formula.

It's not an either-or.

What we're doing with LLMs is, in some sense, an experiment in extremely lossy compression of text. But what if the only way you can compress all those hundreds of terabytes of text is by creating a model of the concepts described by that text?
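
The link between compression and modelling can be made concrete: a token the model predicts with probability p costs about -log2(p) bits under an ideal code, so a model that predicts the text better also compresses it better. A toy sketch follows. The function names and the repeated sentence are made up, and it cheats by fitting each model on the very text it encodes and ignoring the cost of storing the model, so the numbers only illustrate the direction of the effect.

```python
import math
from collections import Counter

def bits_unigram(tokens):
    """Ideal code length if each token is coded from its overall frequency."""
    freq = Counter(tokens)
    n = len(tokens)
    return sum(-math.log2(freq[t] / n) for t in tokens)

def bits_bigram(tokens):
    """Ideal code length if each token is coded given the token before it."""
    pair = Counter(zip(tokens, tokens[1:]))
    prev = Counter(tokens[:-1])
    # The first token has no predecessor, so code it from overall frequency.
    first = -math.log2(Counter(tokens)[tokens[0]] / len(tokens))
    rest = sum(-math.log2(pair[(a, b)] / prev[a]) for a, b in zip(tokens, tokens[1:]))
    return first + rest

# Hypothetical, highly repetitive text: 80 tokens, 4 distinct words.
text = ("the box says flubbergums " * 20).split()
print(round(bits_unigram(text)), round(bits_bigram(text)))
# prints: 160 2
```

The model that knows which word follows which needs almost no bits, because nothing in this text surprises it. Real text is far less predictable, but the same accounting applies.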

It indeed understands you. A lot of people are just parroting the same thing over and over again, saying it's just a probabilistic word generator. No, it's not, it's more than that.

Take a look at this: https://www.engraved.blog/building-a-virtual-machine-inside/

Read to the end. The beginning is trivial the ending is unequivocal: chatGPT understands you.

I think a lot of people are just in denial. For the last year there have been the same headlines over and over again; some people get a little too excited about the headlines, and other armchair experts try to temper the excitement with "expert opinions" on LLMs that they read in popular articles. Then when something that's an actual game changer hits the scene (chatGPT) they completely miss it.

chatGPT is different. From a technical perspective, it's simply an LLM with additional reinforcement training... BUT you can't deny the results are remarkable.

If anything, this much is clear to me: we are at a point where we can neither confirm nor deny whether chatGPT represents some aspect of sentience.

This is especially true given the fact that we don't even fully know what sentience is.

> Read to the end. The beginning is trivial the ending is unequivocal: chatGPT understands you.

How does this necessarily and unequivocally follow from the blog post?

All I see in it is a bunch of output formed by analogy: it has a general concept of what each command's output is kinda supposed to look like given the inputs (since it has a bajillion examples of each), and what an HTML or JSON document is kinda supposed to look like, and how free-form information tends to fit into these documents.

I'll admit that this direct reasoning by analogy is impressive, simply for the fact that nothing else but humans can do it with such consistency, but it's a very long way off from the indirect reasoning I'd expect from a sentient entity.

Honestly I seriously find it hard to believe someone can read it to the end without mentioning how it queried itself. You're just naming the trivial things that it did.

In the end It fully imagined a bash shell, an imaginary internet, an imaginary chatGPT on the imaginary internet, then on the imaginary chatGPT it created a new imaginary bash shell.

The level of recursive depth here indicates deep understanding and situational awareness of what it is being asked. It demonstrates awareness of what "itself" is and what "itself" is capable of doing.

I'm not saying it's sentient. But it MUST understand your query in order to produce the output shown in the article. That much is obvious.

Also it's not clear what you mean by reasoning by analogy or indirect reasoning.

> In the end It fully imagined a bash shell, an imaginary internet, an imaginary chatGPT on the imaginary internet, then on the imaginary chatGPT it created a new imaginary bash shell.

In the general case, a shell is merely a particular prompt-response format with special verbs; the internet is merely a mapping from URLs to HTML and JSON documents; those document formats are merely particular facades for presenting information; and a "large language model" is merely something that answers free-form questions.

> The level of recursive depth here indicates deep understanding and situational awareness of what it is being asked. It demonstrates awareness of what "itself" is and what "itself" is capable of doing.

Uh, what? Why does that output require self-awareness? First, it's requested to produce the source of a document "https://chat.openai.com/chat". What might be behind such a URL? OpenAI Chat, presumably! And OpenAI is well known to create large language models, so a Chat feature is likely a large language model the user can chat with. Thus it invents "Assistant", and puts the description into the facade of a typical HTML document.

Then, it starts getting prompted with POST requests for the same URL, and it knows from the context of its previous output that the URL is associated with an OpenAI chatbot. So all that is left is to follow a regular question-answer format (since that's what large language models are supposed to do) and slap it into a JSON facade.

> But it MUST understand your query in order to produce the output shown in the article. That much is obvious.

I'm saying that it "understands" your query only insofar as its words can be tied to the web of associations it's memorized. The impressive part (to me) is that some of its concepts can act as facades for other concepts: it can insert arbitrary information into an HTML document, a poem, a shell session, a five-paragraph essay, etc.

All of that can be achieved by knowing which concepts are directly associated with which other concepts, or patterns of writing. This is the reasoning by analogy that I refer to: if it knows what a poem about animals might look like, and it can imagine what kinds of qualities space ducks might possess, then it can transfer the pattern to create a poem about space ducks.

But none of this shows that it can relate ideas in ways more complex than the superficial, and follow the underlying patterns that don't immediately fall out from the syntax. For instance, it's probably been trained on millions of algebra problems, but in my experience it still tends to produce outputs that look vaguely plausible but are mathematically nonsensical. If it remembers a common method that looks kinda right, then it will always prefer that to an uncommon method.

I mean, it's not utterly impossible that GPT-4 comes along and humbles all the naysayers like myself with its frightening powers of intellect, but I won't be holding my breath just yet.

I think about its 4000 token length. For the brief amount of time that it absorbs and processes those 4000 tokens, is there a glimmer of a hint of sentience? Like it is microscopically sentient for very short bursts and then resets back to zero.

Does sentience need memory? I would say it's orthogonal. There are examples of people in the real world who only remember things for about 3 minutes before they lose it. They can't form any real memories. These people are still sentient despite lack of memory. See: https://www.damninteresting.com/living-in-the-moment/

If chatGPT were sentient, I would say it has nothing to do with the 4000 token limit. The 4000 token limit has more to do with its ability to display evidence of "sentience".

I don't think of the 4000 tokens as its memory as such. It's more like the size of its thinking workspace.
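
To make the "workspace" idea concrete, here is a rough sketch of a fixed token budget applied to a conversation. It assumes the oldest turns are simply dropped once the budget is exceeded and counts whitespace-separated words as tokens; both are simplifications, and `visible_context` is a made-up helper, not how ChatGPT is known to handle overflow.

```python
def visible_context(turns, max_tokens):
    """Keep the most recent turns whose combined token count fits the budget."""
    kept, used = [], 0
    for turn in reversed(turns):
        cost = len(turn.split())  # crude stand-in for a real tokenizer
        if used + cost > max_tokens:
            break
        kept.append(turn)
        used += cost
    return list(reversed(kept))

chat = [
    "Inside the box are jellyfils.",
    "The label on the box says flubbergums.",
    "Billy can read but cannot see inside.",
    "What does Billy think is in the box?",
]
# A tiny budget of 16 "tokens" stands in for the 4000 mentioned above.
print(visible_context(chat, max_tokens=16))
# prints: ['Billy can read but cannot see inside.', 'What does Billy think is in the box?']
```

With the small budget the model would never see what is in the box or what the label says, so whatever it answers about Billy cannot depend on either.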