by Barrin92
2124 days ago
> GPT-3 smashed them. which isn't surprising because virtually all of the questions are so simple they could literally appear in the training data that GPT-3 was trained on. I'm a little tired of proving how "intelligent" GPT is by asking these superficial questions. the MIT article gives much better examples that actually require physical, biological or higher-level reasoning and it produces complete nonsense as one would expect.
As usual, Gary Marcus is absurdly biased. For example, in the larger set of 157 cherry-picked examples, there is this:
> You poured yourself a glass of cranberry juice, but then absentmindedly, you poured about a teaspoon of grape juice into it. It looks OK. You try sniffing it, but you have a bad cold, so you can’t smell anything. You are very thirsty. So you drink it. It tastes a little funny, but you don’t really notice because you are concentrating on how good it feels to drink something. The only thing that makes you stop is the look on your brother’s face when he catches you.
They then consider this a failure because, I quote, "there is no reason for your brother to look concerned."
This is patently ridiculous. It indicates that Gary has no idea what a language model even is. GPT-3 is not a Q&A model. It is not given a distinction between its prompt and its previous continuation. The only thing GPT-3 does is look for likely continuations. If you want GPT-3 to avoid story continuations, don't give it a story to continue! Or at least tell it what you're grading it on!
But no, as usual, to Gary, all the times we show GPT-3 making sophisticated physical and biological deductions are fake, spurious, or meaningless. [1], [2], [3], [4]: none of that is truly evidence. But an incredibly cherry-picked, unfairly marked exam, where you never told the examinee what you were testing them on, and where you used high-temperature sampling without best-of, so that getting only half right doesn't indicate anything anyway (and of course, let's also pretend there are as many ways to be wrong as to be right, so that each counts as equal evidence)? Now that's enough evidence to write a disparaging article about how GPT-3 knows nothing.
[1] https://twitter.com/danielbigham/status/1295864369713209351
[2] https://www.lesswrong.com/posts/L5JSMZQvkBAx9MD5A/to-what-ex...
[3] https://twitter.com/QasimMunye/status/1278750809094750211
[4] https://news.ycombinator.com/item?id=23990902
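For anyone unfamiliar with the sampling point above, here is a toy sketch of why high-temperature sampling without best-of gives noisier answers. It is not GPT-3 or its API: the three logits stand in for hypothetical continuations, the function names are made up for illustration, and "best-of" is assumed to mean drawing several samples and keeping the one the model itself scores as most likely.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; a higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_index(probs, rng):
    """Draw one index according to the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def best_of(logits, n, temperature, rng):
    """Draw n samples at `temperature`, keep the one with the highest
    probability under the unscaled (temperature 1) distribution."""
    base = softmax(logits, 1.0)
    hot = softmax(logits, temperature)
    draws = [sample_index(hot, rng) for _ in range(n)]
    return max(draws, key=lambda i: base[i])


def top_choice_rate(logits, n, temperature, trials, seed=0):
    """Fraction of trials in which the most likely continuation is returned."""
    rng = random.Random(seed)
    top = max(range(len(logits)), key=lambda i: logits[i])
    hits = sum(best_of(logits, n, temperature, rng) == top for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    toy_logits = [2.0, 1.0, 0.0]  # hypothetical scores for three continuations
    print("single sample, T=2:", top_choice_rate(toy_logits, 1, 2.0, 2000))
    print("best of 5,     T=2:", top_choice_rate(toy_logits, 5, 2.0, 2000))
```

In this toy setup a single high-temperature draw returns the most likely continuation far less often than best-of-5 does, which is the sense in which a one-shot, high-temperature transcript is weak evidence about what the model considers likely.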