by SixSigma
3709 days ago
Apparently it came as a surprise to the AI NLP teams that had spent years doing manual classification when a deep NN suddenly outperformed them without any prior knowledge. Just make a 300 dimension vector of the occurrence frequencies of word combinations and out fall the rules of language!
Similar techniques were well known and used for years in NLP. E.g. Brown clustering has been used since the early nineties and has been shown to improve certain NLP tasks by quite an amount. NMF has also been used for quite some time to obtain distributed representations of words. Also, many of the techniques used in NLP now (word embeddings, deep nets) have been known for quite a while. However, the lack of training data and computational power prevented these techniques from taking off earlier.
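To make the NMF point concrete, here is a minimal sketch of getting word vectors by factorizing a co-occurrence count matrix. Everything in it is an assumption for illustration: the toy sentences, the window size, the rank, and the choice of Lee-Seung multiplicative updates as the NMF solver. Real systems use far larger corpora and usually reweight the counts first.

```python
import random


def cooccurrence(sentences, window=2):
    """Count how often each word pair appears within `window` tokens."""
    vocab = sorted({w for s in sentences for w in s})
    index = {w: i for i, w in enumerate(vocab)}
    counts = [[0.0] * len(vocab) for _ in vocab]
    for s in sentences:
        for i, w in enumerate(s):
            for j in range(max(0, i - window), min(len(s), i + window + 1)):
                if j != i:
                    counts[index[w]][index[s[j]]] += 1.0
    return vocab, counts


def matmul(a, b):
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - W H."""
    wh = matmul(w, h)
    return sum((x - y) ** 2 for rv, rwh in zip(v, wh) for x, y in zip(rv, rwh))


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factorize V ~ W H (all entries non-negative) with multiplicative updates."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), elementwise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T), elementwise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h


# Toy corpus (made up for this example).
sentences = [
    "the cat chased the mouse".split(),
    "the dog chased the cat".split(),
    "a mouse ate the cheese".split(),
    "a dog ate the bone".split(),
]
vocab, counts = cooccurrence(sentences)
W, H = nmf(counts, rank=3)
word_vectors = dict(zip(vocab, W))  # each row of W is a 3-dimensional word vector
```

Each row of `W` is a low-dimensional, non-negative representation of one word. Nothing here needs a neural network, which is the point: the idea long predates the current wave.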
> Just make a 300 dimension vector of the occurrence frequencies of word combinations and out fall the rules of language!
The 'rules of language' don't just fall out of word vectors. They fall out of embeddings combined with certain network topologies and supervised training. In my experience (working on dependency parsing), you also typically get better results by encoding language-specific knowledge. E.g. if your language is morphologically rich or does a lot of compounding, the coverage of word vectors is going to be pretty bad (compared to e.g. English). You will have to think about morphology and compounds as well. One of our papers that was recently accepted at ACL describes a substantial improvement in parsing German when incorporating/learning explicit information about clausal structure (topological fields).
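As a rough illustration of the coverage problem with compounds, here is a sketch of one common workaround: backing off to character n-grams when a word has no vector of its own. This is not the method from the paper mentioned above. The function names, the tiny two-word vocabulary, and the averaging scheme are all hypothetical; models in the fastText style learn n-gram vectors during training rather than deriving them from word vectors afterwards.

```python
def char_ngrams(word, n_min=3, n_max=5):
    """Character n-grams of the word, padded with boundary markers < and >."""
    padded = f"<{word}>"
    return [padded[i:i + n]
            for n in range(n_min, n_max + 1)
            for i in range(len(padded) - n + 1)]


def build_ngram_table(embeddings):
    """Give each n-gram the average vector of the known words containing it."""
    sums, counts = {}, {}
    for word, vec in embeddings.items():
        for gram in set(char_ngrams(word)):
            acc = sums.setdefault(gram, [0.0] * len(vec))
            for k, x in enumerate(vec):
                acc[k] += x
            counts[gram] = counts.get(gram, 0) + 1
    return {gram: [x / counts[gram] for x in acc] for gram, acc in sums.items()}


def lookup(word, embeddings, ngram_table):
    """Return the word's vector, or an n-gram average if it is out of vocabulary."""
    if word in embeddings:
        return embeddings[word]
    hits = [ngram_table[g] for g in char_ngrams(word) if g in ngram_table]
    if not hits:
        return None  # nothing to back off to
    return [sum(col) / len(hits) for col in zip(*hits)]


# Hypothetical 2-dimensional vectors for two German nouns.
embeddings = {"haus": [1.0, 0.0], "boot": [0.0, 1.0]}
table = build_ngram_table(embeddings)

# "hausboot" (houseboat) has no vector of its own, but shares n-grams with both parts.
compound_vector = lookup("hausboot", embeddings, table)
```

A plain word-level lookup would return nothing for `hausboot`. The backoff at least produces a vector that mixes both parts, though it knows nothing about which part is the head of the compound, which is exactly the kind of language-specific knowledge that still has to be modeled.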
Being able to train extremely good classifiers with a large amount of automatic feature formation does not mean that all the insights previously gained in linguistics or computational linguistics are suddenly worthless.
(Nonetheless, it's an exciting time to be in NLP.)