by nikolayasdf123 1596 days ago
This just shows that this model did not learn anything.

Humans do not need to see billions of examples to learn to add numbers. We see just a few and can apply the learned notation and procedures to infinity with 100% precision.

GPT-3 learned mathematical intuition. Humans can hardly learn the multiplication table over months of repeating the same examples, and that table hardly matters at all. GPT-3 is just plainly the wrong objective for them to be optimising.

4 comments

I'll preface this by saying that I am 100% in the camp that thinks these language models are neither intelligent nor a promising avenue towards understanding intelligence.

But your conclusion here is entirely wrong: the model clearly is learning something. From eyeballing this, the model is right about 10% of the time. If it were spitting out random digits the accuracy would effectively be zero. So what exactly is it learning? Is it memorising the exact equations that it saw in training? Is it learning n-gram patterns that occur frequently in arithmetic equations?
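
To put a rough number on the random-digits baseline, here is a minimal sketch. It assumes the answer has a known number of digits and that each digit is guessed uniformly and independently; both are assumptions made for illustration, not something measured in the linked post, and `random_guess_accuracy` is a hypothetical helper name.

```python
from fractions import Fraction


def random_guess_accuracy(num_digits: int) -> Fraction:
    """Chance that num_digits uniformly random digits all match the true answer.

    Assumes independent, uniform guesses per digit (an illustrative assumption).
    """
    return Fraction(1, 10) ** num_digits
```

Under those assumptions a 5-digit answer is guessed correctly once in 100,000 tries, so an accuracy anywhere near 10% is far above chance.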

I'm not an expert on these things and I'd love to hear from someone who is.

I think fundamentally these models compress the training data into network weights and connections, so in effect if the training data was 6 + 10 = 16 and 9 + 10 = 19 and you then give it 7 + 10, it'll interpolate between what it's seen, or something of the sort, giving you something approximately right. It's also not lossless compression, so what it actually has inside may be 9 + 10 = 18, so yeah.

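
As a toy illustration of that intuition only (not a description of how GPT-3 actually computes anything), the sketch below linearly interpolates between stored examples of "x + 10". The `interpolate` helper and the two dictionaries are hypothetical; the numbers are the ones from the comment above.

```python
def interpolate(memory: dict[int, int], x: int) -> float:
    """Linearly interpolate the stored answer for x from its nearest neighbours.

    Assumes x lies within the range of stored keys.
    """
    lo = max(k for k in memory if k <= x)
    hi = min(k for k in memory if k >= x)
    if lo == hi:
        return float(memory[lo])
    weight = (x - lo) / (hi - lo)
    return memory[lo] + weight * (memory[hi] - memory[lo])


exact = {6: 16, 9: 19}  # 6 + 10 = 16 and 9 + 10 = 19, stored without loss
lossy = {6: 16, 9: 18}  # same data, but 9 + 10 was "remembered" as 18
```

With the exact memory, `interpolate(exact, 7)` lands on 17; with the lossy one, `interpolate(lossy, 7)` comes out at roughly 16.67, which is close but wrong.
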
I think you're completely wrong. This shows that the model learned a lot about at-a-glance math. Sure, if you sit down with pen and paper you can get the answer, but few people could do these reliably in their head. What you can do is figure out the order of magnitude and get a rough answer for the first few digits and the last digits, each with some chance of being wrong. If anything, this shows that it learned math more deeply than any normal computer calculator.

No. A million times no. It’s a language model. It doesn’t understand math at all. It doesn’t even understand language. All it did was spit out something that looks like math. It’s fancy automatic writing.

I’ll concede that if you tokenized the equations correctly, you might be able to get a language model to learn arithmetic, since it’s just symbol manipulation; but to make the leap that a general text model has learned anything like arithmetic is more than two bridges too far.
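
One possible reading of "tokenized correctly" is one token per digit, so that place value is visible to the model instead of being hidden inside multi-digit chunks. A minimal sketch under that assumption; `tokenize_digits` is a hypothetical helper, not any real tokenizer's API.

```python
def tokenize_digits(equation: str) -> list[str]:
    """Split an equation into one token per non-space character."""
    return [ch for ch in equation if not ch.isspace()]
```

For example, `tokenize_digits("123 + 45 = 168")` yields ten single-character tokens: the digits of each number plus the `+` and `=` symbols.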

While deep learning language models are useful for certain cases (e.g. translation and autocomplete), and are better at making superficially grammatical text than previous models, they are most emphatically not learning anything about general concepts. They can’t even create coherent text for more than a paragraph, and even then it’s obvious they have no idea what any of the words actually mean.

These large language models are the MOST overhyped piece of AI I’ve seen in my professional career. The fact that they’re neural nets redux is just the chef’s kiss.

Isn’t the comment you wrote here also just a bunch of “symbol manipulation”?

It definitely hasn’t learned math, but it definitely has learned general concepts.

1) No. Because I didn’t compute anything. This is the result of cognition. There’s a difference. If you think there isn’t, the burden of proof is on you to show that they’re the same, as this has never been the dominant belief, either now or over the last few thousand years.

2) What general concept has it learned? You can’t pull any fact consistently out of these things, because they don’t actually have a model of a world. They have statistical correlations between words. There’s no logical inference. They’re just Eliza.

The vast majority of humans don't just see a few examples and figure it out. They're taught an algorithm. Eventually they may also come up with another algorithm, but they're taught one first.

They also don't have "100% precision". Many, many humans are incredibly bad at math, and even the ones that are good at it often make mistakes.

>They also don't have "100% precision". Many, many humans are incredibly bad at math,

Many humans are bad at surgery; this does not mean that an AI that is slightly better than the average human is an accomplishment.

On the other hand, someone could write the algorithms for math and teach an AI when and how to use them. The rules of math are clear; you don't need a bad search algorithm to approximate them for an extremely limited subset of inputs.
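
For reference, the kind of explicit algorithm meant here can be very short. Below is a sketch of schoolbook addition with carry, working on decimal strings; `schoolbook_add` is a hypothetical name, and the inputs are assumed to be non-negative integers written without signs.

```python
def schoolbook_add(a: str, b: str) -> str:
    """Add two non-negative decimal strings digit by digit, right to left."""
    # Pad the shorter number with leading zeros so the columns line up.
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)

    digits = []
    carry = 0
    for da, db in zip(reversed(a), reversed(b)):
        # Sum one column; keep the ones digit and carry the tens digit.
        carry, digit = divmod(int(da) + int(db) + carry, 10)
        digits.append(str(digit))
    if carry:
        digits.append(str(carry))
    return "".join(reversed(digits))
```

The procedure is exact for any input length, e.g. `schoolbook_add("999", "1")` gives `"1000"`, which is the contrast with an approximation that only works on a limited range of inputs.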

GPT-3 surely has several algorithms for addition in its training corpus. It is just unable to make good use of them.

I think you'd find that most people doing large-number math in their head are also off by a few percent, like this model.

Sure, with pen and paper we can follow specific algorithms manually to very slowly get a precise result. If we wanted a computer to merely follow instructions, then I suspect that there are better ways...

You’re really lowering the bar for success here. It’s now unreasonable for a computer to correctly add two numbers together? Give me a break. It wasn’t even reasonable for a Pentium chip to incorrectly divide two numbers back in 1994.

Neural networks are not used to obtain exact results.

It’s amazing that this thought came out of a neural network.

It's amazing that you thought that this was a sensible way to respond to a discussion.

GPT-NeoX-20B would likely have handled this situation better than you.