Hacker News
by ryanmim 4174 days ago
This is a pretty good explanation of why almost all practical applications of NLP are now accomplished by statistics rather than fancy linguistic grammar models you might have read about in a Chomsky book.

Old school NLP has always fascinated me though, and I'm pretty excited about what might be possible in the future by using more than purely statistical methods for accomplishing NLP tasks. Maybe the author could have speculated more wildly in his prognostication ;)

2 comments

It's important to make a distinction between (i) Chomskyan linguistics, (ii) 90s style symbolic systems, (iii) 90s/early 2000s style statistical systems and (iv) 2010s style statistical systems.

Chomskyan linguistics assumes that statistics and related stuff is not relevant at all, and that instead you need to find the god-given (or at least innate) Universal Grammar and then everything will be great. 90s style symbolic systems adopt a more realistic approach, relying on lots of heuristics that kind of work but aim at good performance rather than unattainable perfection; 90s style statistical models give up some of the insights in these heuristics to construct tractable statistical models.

If you look at 2010s style statistical models, you'll notice that machine learning has become more powerful and you can use a greater variety of information, either using good linguistic intuitions (which help even more with better learning algorithms, but require a certain expressivity as well as some degree of matching between the way of constructing the features and the classification) or unsupervised/deep-NN learning, which constructs generalizations over features.

The main reason that you won't ever see people talking about systems with great machine learning and great linguistic intuitions is that you normally want to treat one of them as fixed and focus on improving the other, i.e., it's more a practical/cultural difference than an actual limitation.

Actually this isn't true, wrt Chomsky. Chomskyan linguistics assumes statistics is very important (and this has been noted by Chomsky himself since at least the early 1960s). Chomsky simply argues that statistics is insufficient on its own. And in truth, most NLPers believe this, but they rarely admit it. Most/all NLP projects have some form of "universal grammar", though usually it's something like a regular grammar (~ a Markov chain) or at best a probabilistic CFG (PCFG). I suspect the reason is that, to some extent, hierarchical structures like this seem so natural that it's hard to imagine what else you could do, so there's a tendency to treat CFGs as not even a grammar choice, but they are one. There are other kinds of grammars (such as pregroup grammars) which lack these notions of hierarchy but work perfectly well for the same domains as CFGs, just in very different ways.
Quoting a symposium with Chomsky talking about statistical AI: http://languagelog.ldc.upenn.edu/myl/PinkerChomskyMIT.html

"I think there have been some successes, but a lot of failures. The successes, to my knowledge at least, are those that integrate statistical analysis with some universal grammar properties, some fundamental properties of language; when they're integrated, you sometimes get results [...] On the other hand there's a lot of work which tries to do sophisticated statistical analysis, you know Bayesian and so on and so forth, without any concern for the uh actual structure of language, as far as I'm aware that only achieves success in a very odd sense of success. There is a notion of success which has developed in computational cognitive science in recent years which I think is novel in the history of science. It interprets success as approximating unanalyzed data."

The model that has become dominant in statistical AI, positing a basic grammar that is strongly underconstrained and eliminating spurious analyses not through "universal grammar" (i.e. presupposed innate structures) but through learned parameters, is something that Chomsky has been very much against. At the same time, work that models grammar with enough precision that you could derive predictions from it (e.g. Ed Stabler's grammar implementation) is seen as nice-to-have but not central to the undertaking of generative grammar.

And I think Chomsky put his thumb right on the difference in goals: Chomsky defines progress in linguistics as work that posits the right ("universal") structures, and argues that these are cognitively real and innate, whereas statistical AI is more interested in predicting useful things with structures that may or may not correspond to anything that is cognitively real.

To people nowadays, the whole distinction between constrained "universal" models with few statistics and underconstrained "statistical" models seems to be a very minor one, since today's statistical models have a lot of structure, and people doing generative grammar aren't totally opposed to using statistics or optimality theory to select most-plausible structures. But, back in the day, when the most expressive statistical models people used were HMMs [hidden Markov models, i.e. probabilistic regular grammars] and PCFGs [probabilistic context-free grammars], the gap was much wider, whereas nowadays the models are a bit more similar while the goals are still different.
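
For anyone who hasn't seen the two model families mentioned above side by side, here is a rough Python sketch of the difference. It's a toy: the vocabulary, rules and probabilities are all made up for illustration, and a plain bigram Markov chain stands in for the regular-grammar end of things (an HMM would add hidden states on top). The chain scores a sentence word by word, while the PCFG scores a tree, so the grammar formalism itself already bakes in an assumption about structure.

```python
from math import prod

# Toy bigram Markov chain: P(next word | current word).
# All probabilities here are invented for illustration.
BIGRAM = {
    ("<s>", "dogs"): 0.5,
    ("<s>", "cats"): 0.5,
    ("dogs", "bark"): 0.8,
    ("dogs", "sleep"): 0.2,
    ("cats", "sleep"): 1.0,
    ("bark", "</s>"): 1.0,
    ("sleep", "</s>"): 1.0,
}

def markov_prob(words):
    """Probability of a word sequence under the toy bigram chain."""
    tokens = ["<s>", *words, "</s>"]
    # Unseen transitions get probability 0 (no smoothing in this sketch).
    return prod(BIGRAM.get(pair, 0.0) for pair in zip(tokens, tokens[1:]))

# Toy PCFG: (left-hand side, right-hand side) -> rule probability.
# Rule probabilities for each left-hand side sum to 1.
PCFG = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("dogs",)): 0.5,
    ("NP", ("cats",)): 0.5,
    ("VP", ("bark",)): 0.6,
    ("VP", ("sleep",)): 0.4,
}

def tree_prob(tree):
    """Probability of a parse tree: the product of the rules it uses.

    A tree is a tuple (label, child, ...), where each child is either
    another tree or a plain word string.
    """
    label, *children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    p = PCFG.get((label, rhs), 0.0)
    for child in children:
        if not isinstance(child, str):
            p *= tree_prob(child)
    return p

sentence_score = markov_prob(["dogs", "bark"])
parse_score = tree_prob(("S", ("NP", "dogs"), ("VP", "bark")))
```

In the chain, "bark" is conditioned on the previous word "dogs"; in the PCFG, it is conditioned only on being the head of a VP. Neither choice is neutral, which is roughly the point made upthread about CFGs being a grammar choice too.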

Well, if you'd like to know what I think we'll be doing in the future, check out the rest of the site. :p

But: I'm building an SDK for conversational AI (think Siri, in any app, and 10 times better), that's what the site as a whole is for. I think in 5 years it'll be pretty commonplace to have fairly natural, Jarvis-like conversations with computers, and within 10 years we'll have R2D2/C3PO robots.

Ya I checked out your root project, will give it a whirl when you open it up. I'm moderately interested in adding voice commands to an app I'm working on and haven't found a service that fits the bill yet.
Well let me know what sorts of things you have in mind, and what you can't find in other services! I'll see if there's something LE can do for you, or could do in the near future. I'm always in the irc channel (#languagengine on freenode), so feel free to drop by. :)