by ohitsdom 3692 days ago
I'm sure it's only a matter of time before someone puts this online in a format that's easy to play with. Looking forward to that.

It's already available here - https://github.com/tensorflow/models/tree/master/syntaxnet

    echo 'Bob brought the pizza to Alice.' | syntaxnet/demo.sh

    Input: Bob brought the pizza to Alice .
    Parse:
    brought VBD ROOT
     +-- Bob NNP nsubj
     +-- pizza NN dobj
     |   +-- the DT det
     +-- to IN prep
     |   +-- Alice NNP pobj
     +-- . . punct
I mean fully online, where I don't have to download and set up TensorFlow.
Yes, probably a few days, unless you go through the effort to set up a web server.

For other non-Parsey McParseface dependency parsers and POS taggers that are web accessible, see http://corenlp.run/ and http://nlp.stanford.edu:8080/parser/.

Does Google's parser have a better sense of humor than the 3 in this thread? They all fail on:

Time flies like an arrow. Fruit flies like a banana.

Really, the mechanism of all these parsers, including SyntaxNet, is the same in that they use statistical training data to set up a neural network. Here's a paper on the Stanford CoreNLP parser, which you can compare with Google's paper: http://cs.stanford.edu/people/danqi/papers/emnlp2014.pdf

So, really, all of the above parsers are weak in that they only output a single best parse, when in reality a sentence can have more than one valid structure, the principal example being the second sentence you've provided. I don't think Google's model has a better sense of humor than the others, no. I expect they were all trained on relatively similar data.
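
To make the "more than one valid structure" point concrete, here is a minimal sketch that counts the parses of the second sentence under a tiny hand-written grammar. Everything in it is an assumption for illustration: the lexicon, the rules, and the chart-counting approach (CKY over a toy constituency grammar). It is not how SyntaxNet or CoreNLP work; both are dependency parsers trained on treebank data.

```python
from collections import defaultdict

# Hypothetical toy lexicon: each word maps to the categories it may take.
LEXICON = {
    "fruit": {"N", "NP"},   # a noun, or a one-word noun phrase
    "flies": {"N", "V"},    # plural noun or verb
    "like": {"V", "P"},     # verb or preposition
    "a": {"Det"},
    "banana": {"N"},
}

# Hypothetical binary rules: (parent, left child, right child).
RULES = [
    ("S", "NP", "VP"),
    ("NP", "N", "N"),       # compound noun: "fruit flies"
    ("NP", "Det", "N"),
    ("VP", "V", "NP"),
    ("VP", "V", "PP"),
    ("PP", "P", "NP"),
]

def count_parses(words):
    """Count distinct derivations of a full sentence (symbol S) with a CKY chart."""
    n = len(words)
    chart = defaultdict(int)  # (start, end, symbol) -> number of derivations
    for i, word in enumerate(words):
        for symbol in LEXICON[word]:
            chart[(i, i + 1, symbol)] += 1
    # Build longer spans from shorter ones, trying every split point.
    for length in range(2, n + 1):
        for start in range(0, n - length + 1):
            end = start + length
            for mid in range(start + 1, end):
                for parent, left, right in RULES:
                    chart[(start, end, parent)] += (
                        chart[(start, mid, left)] * chart[(mid, end, right)]
                    )
    return chart[(0, n, "S")]

print(count_parses("fruit flies like a banana".split()))
```

Under this toy grammar the sentence gets two analyses: "fruit flies" as the subject of the verb "like", and "fruit" as the subject of the verb "flies". A parser that returns a single tree has to pick one of them.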

However, there is probably a trivial way to get the second sentence to parse as

      Subject --- Verb --- Object
     Noun       Verb   Article  Noun
      |   \       |     |        |
    Fruit flies  like   a      banana .
and that is to provide training data with more occurrences of ...

  > N{Fruit flies} V{like} honey. 
  > N{Fruit flies} V{like} sugar water.
than occurrences of

  > A plane V{flies} PREP{like} a bird.
The more sentences using simile that the parser sees in training, the less likely the neural net is to treat 'like' as a verb. It's also affected by every occurrence of [flies like] in the training data.

That's the nature of statistical language tools.
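
A minimal sketch of that effect, assuming a deliberately crude model that tags 'like' purely by how often each tag followed the previous word in training. The example counts are invented, the function names are hypothetical, and a real parser conditions on far more context than one preceding word.

```python
from collections import Counter

def train(examples):
    """Count how often each tag for 'like' follows a given previous word."""
    counts = Counter()
    for prev_word, tag in examples:
        counts[(prev_word, tag)] += 1
    return counts

def best_tag(counts, prev_word, tags=("VERB", "PREP")):
    """Pick the tag seen most often after prev_word (ties go to the first tag)."""
    return max(tags, key=lambda tag: counts[(prev_word, tag)])

# Invented training sets: (word before 'like', tag assigned to 'like').
# Mostly similes, e.g. "A plane flies like a bird."
simile_heavy = [("flies", "PREP")] * 3 + [("flies", "VERB")]
# Mostly insects, e.g. "Fruit flies like honey."
fruit_fly_heavy = [("flies", "PREP")] + [("flies", "VERB")] * 3

print(best_tag(train(simile_heavy), "flies"))     # PREP
print(best_tag(train(fruit_fly_heavy), "flies"))  # VERB
```

Same input, different training mix, different answer.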

The stock parser debuted here gives the same answer as CoreNLP, by the way.

    flies VBZ ROOT
     +-- Fruit NNP nsubj
     +-- like IN prep
     |   +-- banana NN pobj
     |       +-- a DT det
     +-- . . punct
So much for Parsey McParseface's sense of humor.
I have an open-source visualizer for CoreNLP that would be easy to adapt: http://nlpviz.bpodgursky.com/
Thank you for sharing this. Do you think it can handle multiple parses?