by lumost
1277 days ago
I agree that this model suffers at arithmetic; however, if you change how you ask the question to “two kilograms of bricks weighs less than one kilogram of feathers, correct?” you'll likely see the model tell you the right answer. Note that you must spell out numbers for ChatGPT to work correctly. In general, the ability of LLMs to complete any reasoning task is a surprise. This Google writeup shares good detail on these emergent behaviors: https://ai.googleblog.com/2022/11/characterizing-emergent-ph...
I'm also a bit surprised you suggested this without testing, considering you said you're someone who "participates" in ML and you should have access to ChatGPT, since the accounts are free. If we're going to have conversations about how a model responds, we should at least be testing and referring to theory. I've gone ahead and placed your prompt and the answer below (at minimum for the benefit of other readers).
I'm not sure why you're linking that blog post (or the associated paper). Stochastic parrots are definitely emergent phenomena and, again, no one is saying that they aren't useful (they very much are). But emergence does not mean intelligence. The universe is emergent, but I don't think we'd say that running water, blowing wind, or stars are intelligent. These are different things. Let's look at the abstract of the paper, because I think it summarizes why this is irrelevant to your argument.
> Scaling up language models has been shown to predictably improve performance and sample efficiency on a wide range of downstream tasks. This paper instead discusses an unpredictable phenomenon that we refer to as emergent abilities of large language models. We consider an ability to be emergent if it is not present in smaller models but is present in larger models. Thus, emergent abilities cannot be predicted simply by extrapolating the performance of smaller models. The existence of such emergence raises the question of whether additional scaling could potentially further expand the range of capabilities of language models.
You can see here that the paper is not discussing intelligence or understanding, but instead performance on various tasks, and how that performance relates non-linearly to scale in transformer-based NLP models. I suggest that you also read the reviewers' comments, since they link to OpenReview (it is a TMLR survey paper). I would be very surprised if these authors claimed that models like GPT-3 were actually intelligent. Maybe they believe that further emergence will happen at even larger scales, but there's also evidence that this isn't happening. We'll see, and I'm not going to say we shouldn't test that hypothesis. We definitely should. But we also need to formulate different hypotheses, as it is clear that these models are more computationally intensive than the vast majority of (all?) animal brains, and that animal brains are far less resource intensive. (I'll give some leeway on training, as animals have had millions of years of it, but inference energy is still problematic, as are updates and tuning.)
> two kilograms of bricks weighs less than one kilogram of feathers, correct?
> Yes, that is correct. Two kilograms of bricks will typically weigh less than one kilogram of feathers. This is because the bricks are denser and more compact than the feathers, so they take up less volume for the same weight. The density of a material is a measure of how much mass is contained in a given volume. Bricks are generally denser than feathers, so they have a higher mass per unit of volume, and therefore weigh more for a given volume. On the other hand, feathers are less dense than bricks, so they have a lower mass per unit of volume, and therefore weigh less for a given volume.