|
My reading is that Pereira doesn't think that deep learning has quite conquered language, and in this he's in complete disagreement both with Goldberg and Le Cunn's side (who both champion deep learning for NLP and claim that it has led to great advances in the field). For me the problem with NLP and deep learning, or indeeed any empirical method, is that the evaluation metrics we have are imperfect. Take BLEU scores, from Goldberg's post, for instance. Those basically compare generated text to some arbitrary target. Originally, they were proposed as metrics of machine translation quality, so the target was some existing translation and the machine-generated translation was examined for coverage of this human-made translation. But of course, there is no principled way that we know of to choose one translation over another- or even say whether a translation is a good or bad translation, on its own. And that's true for translations by humans also. You give the same text to 10 professional translators, they'll give you 10 different translations. Then you give each of their translations to 10 readers and ask them for their opinion, and you get back 100 different opinions. The translation task itself is not even particularly well defined, exactly because there may be any number of valid translations (possibly, infinitely many) of a piece of text in another language. So, with translation, we have an ill-defined task with an arbitrary metric. And that metric of course is lifted from its original task and used to evaluate language generation and so on. Then someone comes along who knows how to train a deep net but has no idea what the purpose of their chosen metric is, or what it does and has no understanding of the task itself- and claims to have solved it because they got good results on that metric. It's a bit of a methodological mess that's not going to lead to much progress. People can keep piling on these "results" for as long as they like and pretend that they're "solving" this or that problem- but in real-world terms, nothing is really being solved at all. |