by andreyk 2339 days ago
The TLDR:

- There are two opposing views about the nature of human intelligence: nativism (which holds that a fair deal of intelligence is already encoded in us when we are born, e.g. according to Chomsky we are 'primed' to learn language a certain way) and empiricism (which holds that we mostly learn things from scratch via experience).

- GPT-2 is a recent, very large neural net trained to take in a few words or sentences and predict which words are most likely to come next. It was trained on an absurdly huge amount of data with an absurdly huge amount of compute, at a fairly large cost.

- GPT-2 is pretty impressive in many ways: what it predicts is syntactically correct, relevant to the input, and very versatile (it can complete text on just about any subject you can think of). But its predictions often exhibit a lack of basic common sense.

- Since it lacks common sense even though a ton was invested in it, the piece posits that GPT-2 is evidence that 'empirical' approaches to intelligence are seemingly wrong and that alternatives are a good idea from now on.

To be fair, GPT-2 does have some innate built-in structure (it's not just a fully connected neural net; it uses the popular Transformer architecture, which relies on the fairly recent idea of self-attention as a core building block). And it's fair to argue that GPT-2 is only evidence that training on word prediction conditioned on input is not enough to get to common sense; perhaps a different task/loss built on top of a Transformer model would work just fine. But really, the whole research project of Deep Learning has been an exercise in nativism (since most research tries to find new and better neural net architectures, i.e. priors for learning, for various tasks), aside from OpenAI, which is much bigger on just scaling existing stuff up. So the piece's conclusion more or less agrees with current AI trends.
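For anyone who wants "predict which words are most likely to come next" made concrete, here is a toy sketch of that objective. It is not how GPT-2 works: it swaps the Transformer for plain bigram counts, and the corpus and the function names `train_bigram` and `predict_next` are made up for illustration.

```python
from collections import Counter, defaultdict


def train_bigram(text):
    """Count, for each word, which words follow it in the training text."""
    counts = defaultdict(Counter)
    words = text.lower().split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts


def predict_next(counts, word):
    """Return the most frequently observed next word, or None if unseen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical toy corpus; a real model would see vastly more text.
corpus = "the cat sat on the mat . the cat ate the fish ."
model = train_bigram(corpus)
print(predict_next(model, "the"))
```

Everything this predictor "knows" comes from counting the training text, which is about as empiricist as a model gets. Its only built-in prior is the assumption that the previous word alone is enough context.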

1 comment

> Since it lacks common sense and a ton was invested it, the piece posits it is evidence in favor of 'empirical' approaches to intelligence seemingly being wrong and alternatives being a good idea from now on.

It is unclear to me what the distinction between an "empirical" and a non-"empirical" approach even means in this context.

Do nativists a la Chomsky suggest that these "language frameworks" are independent of the basic interactions of neurons in the brain?

If you view human evolution as the learning procedure for building brain structure, GPT-2 seems entirely consistent with the 'nativist' approach, no?