by cbr 3691 days ago

    better at parsing the Penn Treebank than the best
    natural language parser for English on the Wall
    Street Journal
I'm pretty sure "the 20 year old Penn Treebank" and "the Wall Street Journal" are referring to the same dataset here. In the early 1990s the first large treebanking efforts were on a corpus from the WSJ, and they were released as the Penn Treebank (https://catalog.ldc.upenn.edu/LDC95T7). People report results on this dataset because that's what the field has been testing on (and overfitting to) for decades.

(I worked on a successor project, OntoNotes, that involved additional treebank annotation on broader corpora: https://catalog.ldc.upenn.edu/LDC2013T19)

1 comments

Yes, the press release is (actually) pretty difficult to parse and really opaque about how the comparison is measured, which is why I wanted to question the blog's headline, "The World's Most Accurate Parser." It seems clearer now, but Google evidently doesn't feel the need to prove outright that they are the best in the world at a task, which is a bit questionable considering how many followers they have. In all, it seems they tested against several other dependency parsers, but clearly not all of them. It's fair to call it "highly accurate," but this parser still falls victim to some of the same issues that most statistical parsers do, and while it is faster than some dependency parsers, it is not faster than all of them.

The point about overfitting is valid, too, which is another reason why this "most accurate such model in the world" claim is obnoxious.

It's also fair to note that their advance amounts to fractions of a percentage point on this specific dataset over models that are 5-10 years older.