by xigency
3692 days ago
Really, the mechanism of all these parsers, including SyntaxNet, is the same: they use statistical training data to train a neural network. Here's a paper on the Stanford CoreNLP parser, which you can compare with Google's paper: http://cs.stanford.edu/people/danqi/papers/emnlp2014.pdf

So all of the above parsers are weak in that they only output a single best parse, when in reality a sentence can have more than one valid structure, the principal example being the second sentence you've provided.

I don't think Google's model has a better sense of humor than the others, no. I expect they have all used relatively similar training data.

However, there is probably a trivial way to get the second sentence to parse as

   Subject   ---   Verb --- Object
     Noun          Verb Article   Noun
      |  \          |      |       |
    Fruit flies    like    a     banana .

and that is to provide training data with more occurrences of ...

> N{Fruit flies} V{like} honey.
> N{Fruit flies} V{like} sugar water.

than occurrences of

> A plane V{flies} PREP{like} a bird.

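To make the training-mix point concrete, here is a toy sketch. It is not how SyntaxNet or CoreNLP work internally (those are neural parsers); it swaps in a plain frequency count as a stand-in for "whatever the model saw most often". The tag names, the two tiny corpora, and the helper functions `like_tag_counts` and `best_tag` are all made up for illustration.

```python
from collections import Counter

def like_tag_counts(corpus):
    """Count the tags given to 'like' when it directly follows 'flies'."""
    counts = Counter()
    for sentence in corpus:
        # Walk over adjacent (word, tag) pairs in each tagged sentence.
        for (prev_word, _), (word, tag) in zip(sentence, sentence[1:]):
            if prev_word == "flies" and word == "like":
                counts[tag] += 1
    return counts

def best_tag(counts):
    """Return the single most frequent tag, mimicking a one-best output."""
    return counts.most_common(1)[0][0]

# Hypothetical hand-tagged training sentences.
simile = [("A", "DT"), ("plane", "N"), ("flies", "V"),
          ("like", "PREP"), ("a", "DT"), ("bird", "N")]
fruit_honey = [("Fruit", "N"), ("flies", "N"), ("like", "V"), ("honey", "N")]
fruit_sugar = [("Fruit", "N"), ("flies", "N"), ("like", "V"),
               ("sugar", "N"), ("water", "N")]

mostly_simile = [simile] * 3 + [fruit_honey]             # 3 PREP vs 1 V
mostly_fruit = [simile] + [fruit_honey, fruit_sugar] * 2  # 1 PREP vs 4 V

print(best_tag(like_tag_counts(mostly_simile)))  # PREP
print(best_tag(like_tag_counts(mostly_fruit)))   # V
```

Same sentence, same counting rule, and the answer flips purely because of what the training set contains. A real parser conditions on far more context than the previous word, but the dependence on the data is the same in spirit.
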
The more sentences using simile that the parser sees, the less likely the neural net is to treat 'like' as a verb. It's also affected by all of the uses of [flies like]. That's the nature of statistical language tools.

The stock parser debuted here gives the same answer as CoreNLP, by the way.

    flies VBZ ROOT
     +-- Fruit NNP nsubj
     +-- like IN prep
     |   +-- banana NN pobj
     |       +-- a DT det
     +-- . . punct

So much for Parsey McParseface's sense of humor.