by throwaway-1209
3290 days ago
The whole field of NLP and computational linguistics reminds me of that joke where a drunk is looking for his keys under a street lamp instead of where he actually lost them. This is true in particular of anything that pertains to reasoning and knowledge representation. People are still trying to "infer rules" and do logical, rather than probabilistic, reasoning. I get why that is. To me though, the kind of real-life reasoning that humans do seems heavily probabilistic and contextual, Bayesian almost. And there's next to no notable work going on in that direction.
That is because it's very hard to collect statistics on something that you can't really quantify: meaning, in this case.
There was a thread on HN a couple of days ago about a blog post where someone was experimenting with, among other things, training an LSTM network to generate Java programs [1]. In one example, the LSTM did really well at reproducing the structure of a Java program, with import declarations, followed by a class implementing an interface with a few methods with structured comments and throws declarations and everything, and even a test!
On the other hand, this program was completely useless. From a cursory glance it would probably not even compile (e.g. it referred to undeclared variables). There was one method named "numericalMean()" that took a single double and returned an (undeclared) variable "sum". The class had a nonsensical name, "SinoutionIntegrator". The test was testing something called "Cosise", presumably a method, but not one defined in the class. In short: a mess.
That might sound a bit harsh, but I think it's a very good example of why statistical NLP is really bad at doing meaning: because there is nothing, not a shred, of meaning in the data we use to train statistical models of language, i.e. text.
Because, you see, the relation between meaning and text (and even spoken language) is completely arbitrary. Or, to put it another way, there is potentially an infinite number of valid mappings between structure and meaning, of which we, human beings, somehow by convention or some other crazy mechanism, have agreed to use just one. And even though the various forms language entities take (inflections etc.) are used exactly to convey meaning, right, the rules of how meaning varies with structure are, again, completely independent of structure itself.
Now, we have done very well in modelling structure from examples of it (which is what text is). But it's completely unreasonable to expect our algorithms to be able to extract meaning from it as well.
And that is why people are still trying to put down the rules of meaning by hand. Because that's the only way we can think of, currently, to process meaning automatically.
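To make the structure-without-meaning point concrete, here is a toy sketch. It is not the LSTM from the blog post: I've swapped in a much cruder bigram (Markov chain) model, and the Java-like training text, the function names and the seed are all made up for illustration. The model only ever learns which token tends to follow which, so whatever it emits is locally plausible Java-ish text with no notion of what any identifier refers to.

```python
import random
from collections import defaultdict

# Hypothetical training text: a few Java-like lines invented for this example.
CORPUS = """
public double mean ( double x ) { return sum / count ; }
public double variance ( double x ) { return total / count ; }
public void reset ( ) { sum = 0 ; }
"""


def train_bigrams(text):
    """Record, for every token, the tokens observed immediately after it."""
    tokens = text.split()
    table = defaultdict(list)
    for prev, nxt in zip(tokens, tokens[1:]):
        table[prev].append(nxt)
    return table


def generate(table, start, length, seed=0):
    """Sample a token sequence by repeatedly picking an observed follower."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length - 1):
        followers = table.get(out[-1])
        if not followers:  # token never seen with a successor: stop early
            break
        out.append(rng.choice(followers))
    return out


if __name__ == "__main__":
    table = train_bigrams(CORPUS)
    print(" ".join(generate(table, "public", 20, seed=1)))
```

Every adjacent pair of tokens in the output was seen in the training text, so the result tends to look like Java token by token. Nothing in the table says that "sum" has to be declared before it is returned, or what a mean is, which is the same kind of gap as in the LSTM example, just at a much smaller scale.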
________
[1] https://news.ycombinator.com/item?id=14526305