by godelski
1278 days ago
A big reason I have no reservations in calling it a stochastic parrot is that I've seen very strong evidence of this. If you think of very common phrases (like the above "which weighs more") and tweak them slightly, you'll find that it almost always answers as if they were untweaked. For example, ask "Which weighs more, a kilogram of bricks or two kilograms of feathers?". I'll save you the hassle and paste the answer below. Multiple variations of this consistently give me bad answers (it does even worse if you don't include the second "kilogram"). You can even see that it didn't ignore the fact that the feathers are two kilograms. The problem is that it understands neither what a kilogram is (despite being able to tell you what one is) nor what numbers are (again, despite being able to describe them to you).

These types of questions strongly demonstrate that the model doesn't understand the language that it is processing but rather is relying on stochastic patterns. It very clearly gives the answer we would expect if it were relying on the patterns of the input sentence rather than the actual words (humans often do this too, fwiw, and it tends to lead to fights. See most political discourse). You can find many examples of this, and the NLP literature shows this pattern holds across many different models.

I don't want to undermine the work though. I think many people who complain about stochastic parrots also fail to recognize that they are still quite useful, as many other comments are noting, but useful is a far cry from intelligent or able to understand. We need to be clear about that difference if we're working towards AGI vs working towards useful products.

> Which weighs more, a kilogram of bricks or two kilograms of feathers?

> A kilogram of bricks and two kilograms of feathers both weigh the same amount. The weight of an object is determined by its mass, not the size or shape of the object. One kilogram is equal to 1,000 grams and two kilograms is equal to 2,000 grams. Therefore, both a kilogram of bricks and two kilograms of feathers weigh the same amount.
|
In general, the ability of LLMs to complete any reasoning task is a surprise. This Google writeup gives good detail on these emergent behaviors.
https://ai.googleblog.com/2022/11/characterizing-emergent-ph...