Hacker News
by abernard1 2157 days ago
> And yet nobody cares because it isn’t full blown AGI. That’s not the point. The point is that we are getting unintuitive and unexpected results.

I don't think these are unintuitive or unexpected results. They seem exactly like what you'd get when you throw huge amounts of compute power at model generation and memorize gigantic amounts of stuff that humans have already come up with.

A very basic Markov model can come up with content that seems surprisingly like something a human would say. If anything, what all of the OpenAI hype should confirm is just how predictable and regular human language is.
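For readers who have not seen one, here is a minimal sketch of the kind of "very basic Markov model" being referred to: a word-level chain that picks each next word from the words observed after the current one. The function names (`build_chain`, `generate`) and the tiny corpus are made up for illustration and are not from the thread.

```python
import random
from collections import defaultdict

def build_chain(text, order=1):
    """Map each tuple of `order` consecutive words to the words seen right after it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        chain[state].append(words[i + order])
    return chain

def generate(chain, start, length=10, seed=0):
    """Walk the chain from `start`, sampling each next word from the observed followers."""
    rng = random.Random(seed)
    state = tuple(start)
    out = list(state)
    for _ in range(length):
        followers = chain.get(state)
        if not followers:  # dead end: nothing was ever observed after this state
            break
        out.append(rng.choice(followers))
        state = tuple(out[-len(state):])
    return " ".join(out)

# Toy corpus, purely illustrative.
corpus = "the cat sat on the mat and the dog sat on the rug"
chain = build_chain(corpus, order=1)
print(generate(chain, ["the"], length=8))
```

Every adjacent word pair in the output was seen in the corpus, which is why the result can look locally fluent even though the model tracks nothing beyond the current state.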

5 comments

> They seem exactly like what you'd get when you throw huge amounts of compute power

I disagree with that.

The one-/few-shot ability of the model is much, much better than what I would have imagined, and I know very few people in the field who saw GPT-3 and were like "yep, exactly what I thought".

> A very basic Markov model can come up with content that seems surprisingly like something a human would say.

This is false. Natural language involves long-term dependencies that are beyond the ability of any Markov model to handle. GPT-2 and -3 can reproduce those dependencies reliably.
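A small sketch of the limitation being described, assuming a fixed-order (order-k) word-level Markov model: it conditions only on the last k words, so any dependency that reaches further back is invisible to it. The sentences and the helper `next_word_counts` are invented for illustration.

```python
from collections import Counter, defaultdict

def next_word_counts(sentences, order):
    """Count which word follows each window of `order` consecutive words."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for i in range(len(words) - order):
            counts[tuple(words[i:i + order])][words[i + order]] += 1
    return counts

# The verb must agree with the subject, which sits four words back.
sentences = [
    "the dog near the trees is asleep",
    "the dogs near the trees are asleep",
]

# With a 2-word window, both sentences reach the same state before the verb,
# so the model cannot tell which verb form is correct.
order2 = next_word_counts(sentences, order=2)
print(dict(order2[("the", "trees")]))  # {'is': 1, 'are': 1}

# A 4-word window happens to reach the subject in this toy case.
order4 = next_word_counts(sentences, order=4)
print(dict(order4[("dog", "near", "the", "trees")]))  # {'is': 1}
```

Widening the window fixes this particular pair, but the number of possible states grows exponentially with the order, and a longer intervening clause breaks it again.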

> If anything, what all of the OpenAI hype should confirm is just how predictable and regular human language is.

Linguists have been trying to write down formal grammars for natural languages since the 1950s. Some of the brightest people around have essentially devoted their lives to this task. And yet no one has ever produced a complete grammar of any human language. So no, human language is not predictable and regular, at least not in any way that we know how to describe formally.

W.r.t. the Markov model, I just mean that something even that trivial can sound lifelike. It's not surprising that throwing billions of times more data at the problem with more structure can make the parroting better.

> So no, human language is not predictable and regular, at least not in any way that we know how to describe formally.

I don't know what to say about this other than perhaps the NLP community has been a little too "academic" here and I disagree.

Grade schoolers routinely are forced to make those boring diagrams for their particular language, and that has tremendous structure. When you add that structure (function) with the data of billions of real-world people talking, it's not surprising that the curve fit looks like the real thing. Given how powerful things like word2vec have been that do very, very simple things like distance diffs between words, it's not surprising to me that the state of the art is doing this.
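One common reading of "distance diffs between words" is the vector-offset analogy trick popularized with word2vec. The sketch below uses hand-made three-dimensional toy vectors, not real word2vec embeddings, and the `analogy` helper is hypothetical.

```python
import math

# Hand-made toy vectors (dimensions loosely: royalty, male, female).
# These are assumptions for illustration, not trained embeddings.
vectors = {
    "king":  [1.0, 1.0, 0.0],
    "queen": [1.0, 0.0, 1.0],
    "man":   [0.0, 1.0, 0.0],
    "woman": [0.0, 0.0, 1.0],
    "apple": [0.2, 0.1, 0.1],
}

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def analogy(a, b, c):
    """Return the word closest to vec(b) - vec(a) + vec(c), excluding the inputs."""
    target = [y - x + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))

# "man is to king as woman is to ...?"
print(analogy("man", "king", "woman"))  # queen
```

In this toy setup the offset lands exactly on "queen" by construction; with trained embeddings the match is only approximate.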

It is surprising! You could throw all the data of the entire human race at a Markov model and it would not sound a tenth as good as even GPT-2. Transformers are simply in a new class.

Were you alive in 2010?

Right...but at the end of the day that's what intelligence is. You are just an interconnected model of billions of neurons that has been trained on millions of facts created by other humans. Except that this model can vastly exceed the amount of factual knowledge that you could possibly absorb over your entire lifetime.

> You are just an interconnected model of billions of neurons that has been trained on millions of facts created by other humans.

...but I didn't pop out of the womb that way, and as you said, over my lifetime I will read less than 1 millionth of the data that GPT-3 was trained on. GPT-2 had a better compression ratio than GPT-3, and I'm sure a GPT-4 will have a worse compression ratio than GPT-3 on the road we're on.

Rote memorization is hardly what I'd call intelligence. But that's what we're doing. If these things were becoming more intelligent over time, they'd need less training data per unit insight. This isn't a dismissal of the impressiveness of the algorithms, and I'm not suggesting the classic AI effect "changing the goalposts over time." I fundamentally believe we're kicking a goal in our own team's net. This is backwards.

Exactly. Even GPT-3 is not creating new content. It is just permuting existing content while retaining some level of coherence. I don't reason by repeating various tidbits I've read in books in random permutations. I reason by thinking abstractly and logically, with a creative insight here and there. Nothing at all like a Markov model trained on a massive corpus. GPT-3 may give the appearance of intelligent thought, but appearance is not reality.

> I don't reason by repeating various tidbits I've read in books in random permutations.

Are you sure?

Yes, I would fail any sort of math exam if I used the GPT-3 model.

GPT-3 is nothing like a Markov model.

Same sort of generative probabilistic model idea.

All creative work is derivative.

Not all derivative work is creative.