|
|
|
|
|
by bravura
4787 days ago
|
|
The Stanford parser is particularly slow --- it's in java, and it's written for research more than anything. The choice of language is not what causes the slowness of the Stanford parser. It's the choice of search strategy, which trades-off speed for accuracy. Shift-reduce dependency parsers are linear time, and are giving state-of-the-art results. No, this is incorrect. The choice of parsing logic (shift-reduce, dependency) and the search strategy (greedy, sometimes erroneously called "deterministic") are orthogonal. It's the greedy search strategy that leads to linear time performance. The choice of logic determines the lower-bound (best-case) on parsing complexity. If you do exhaustive search for the exact solution of a shift-reduce dependency parser, it is worst-case exponential. In practice, you don't do exact search, and by using a beam search approximation you can get observed linear-time performance. [edit: You can read my thesis if you are not familiar with what a parsing logic is.] I am not aware of state-of-the-art results from greedy shift-reduce parsers. Do you mind sharing? |
|
Even amongst chart parsers it's not a very quick implementation.
> The choice of parsing logic (shift-reduce, dependency) and the search strategy (greedy, sometimes erroneously called "deterministic") are orthogonal. It's the greedy search strategy that leads to linear time performance.
They're really only conceptually orthogonal, because in practice once you choose shift-reduce you're always going to choose greedy/deterministic/whatever search. I'm not aware of any shift-reduce/transition-based parsing results that don't use 1-best or beam search.
> I am not aware of state-of-the-art results from greedy shift-reduce parsers. Do you mind sharing?
Zhang and Nivre (2011) http://www.sutd.edu.sg/cmsresource/faculty/yuezhang/acl11j.p... 93.5 UAS on Stanford basic dependencies. This is the best published result on the dataset, and the best published dependency parsing result for English.
Probably the parser is still worse than the C&J reranking parser, when 200 parses are supplied to the reranker. But I'd be very surprised if it wasn't better than the Stanford parser, and probably also the Berkeley parser.