|
|
|
|
|
by famouswaffles
1117 days ago
|
|
>LLM has a good memory. LLM is told (or can infer through relevant information like keywords: "diamond axe") that it is in a minecraft setting. It then looks up a compressed version of a player's guide that was part of its training data. It then uses that data to execute goals. What about any of what you've just said screams parrot to you ? I mean here is how the man who coined the term describes it. A "stochastic parrot", according to Bender, is an entity "for haphazardly stitching together sequences of linguistic forms … according to probabilistic information about how they combine, but without any reference to meaning." So..what exactly from what you've just stated implies the above meaning ? |
|
>>LLM has a good memory.
Pretty much this.
> the man
The woman. Bender is a woman. In fact, 3 of 4 of the authors are woman and the 4th has unknown identity.
> according to probabilistic information about how they combine, but without any reference to meaning.
This is the part. I don't think the analogy of the parrot is particularly apt because we all know that the parrot doesn't understand calculus but is able to repeat formulas if you teach it. But we have to realize that there are real world human examples of stochastic parrots, and these are more akin to LLMs. If you don't know the phrase "Murry Gelman Amnesia" let me introduce you to it[0]. It is the concept that you can hear a speaker/writer talk about a subject matter you're familiar with, see them make many mistakes, then when they move to a subject matter you are not familiar with you trust them. We can call this writer or speaker a stochastic parrot as well since they are using words to sound convincing but they do not actually know the meaning behind the words. It is convincing because it matches the probabilistic information that a real expert may use. The difference is in understanding.
But this gets us to a topic at large that is still open: what does it mean to understand? We have no real answer to this. But a well agreed upon part of the definition is the ability to generalize: to take knowledge and apply it to new situations. This is why many ML researchers are looking at zero-shot tasks. But in the current paradigm this term has become very muddied and in many cases is being used incorrectly (you can see my rants about code generation with HumanEval or how training on LAION doesn't allow for zero shot COCO classification).
For specifically this work, we need to evaluate and think about understanding carefully. The critique I am giving is that people are acting as this is similar "understanding" to how we may drop a 10 year old into Minecraft and that 10 year old can figure out how to play the game despite never hearing about the game before (though maybe has played games before. But Minecraft is also many kids "intro game"). This is clearly not what is happening with GPT. GPT has processed a lot of information on the game before entering its environment. It has read guides of how to play, how to optimize game play, it has seen images of the environment (though this version doesn't use pixel information), and has even read code for bots that will farm items. The prompts used in this work tell GPT to use Mineflayer. They also tell it things like that mining iron ore gets you raw iron and several other strong hints of how to play the game. Chain of Thought (CoT) prompts also bring into doubt the understanding nature of a LLM, and really provide a strong case against understanding (since this is something an understanding creature considers). CoT is adding recurrent information into the bot and this causes statistical (Bayesian) updates. This is not dissimilar from allowing you to reroll a set of dice while also being able to load the dice. You can argue that CoT is part of the thought process for an entity that understands things, but need to recognize that this is not inherit to how GPT does things. You may want to draw an analogy to when teaching a child something and they confidently spit out the wrong answer and then you say "are you sure?" but we need to be careful to draw these parallels and think very nuanced and carefully. The nuance is critical here.
But I want to give you some more intuition into this understanding idea. We attribute understanding to many creatures and I'll select a subset that is more difficult to argue against: mammals and birds. While they don't understand everything at the level of humans, it is clear that there are certain tasks they understand, being able to use tools, quickly adapt to new novel environments, and much more. But there's a key clue here about something, we know that they can all simulate their environments. How? Because they dream. I can't help but think this is part of the inspiration for Philip K Dick naming his book that way, since this is question we're getting at is part of its central theme. But as for GPT, it isn't embodied. It does not seem to be able to answer questions about itself and it has show clear difficulties in simulating any environment. While it can make some hits, it makes more misses.
TLDR: see this prompt and ChatGPT's response: https://i.imgur.com/sK4pLw0.png
[0] https://www.epsilontheory.com/gell-mann-amnesia/
Fwiw: Bard answers similarly to ChatGPT: https://i.imgur.com/CmWsf9X.png https://i.imgur.com/QJXIBDl.png https://i.imgur.com/zSGjYss.png
Side Note: I'm even often critical of Bender myself. I think she is far too harsh on LLMs and is promoting dommerism that isn't helpful. But this has nothing to do with the meaning of Stochastic Parrot. We should also recognize that the term has changed as it has entered the lexicon and adapted. Just like every other word/phrase in human language.