Hacker News
by butterm 3534 days ago
Yup, I know about that, but displaCy is just so much more beautiful.

Also, while NLTK's basic Twitter tokenizer is okay, I find that ARK's tokenizer [0] is much better. Similarly, for POS tagging of tweets, I am using the GATE POS tagger [1]. They provide a Stanford model, so I can hook it up with NLTK using the StanfordTagger class. In fact, this is the kind of integration that I am missing in spaCy.
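For anyone wanting to try this, here is a rough sketch of the kind of glue I mean, not a definitive setup. It assumes NLTK's StanfordPOSTagger (the concrete POS subclass in nltk.tag.stanford), a working Java install, and local copies of the GATE model and the Stanford tagger jar. The names make_pipeline and nltk_gate_pipeline and both path arguments are made up for illustration.

```python
from typing import Callable, List, Tuple

Tokenizer = Callable[[str], List[str]]
Tagger = Callable[[List[str]], List[Tuple[str, str]]]


def make_pipeline(tokenize: Tokenizer, tag: Tagger) -> Callable[[str], List[Tuple[str, str]]]:
    """Compose any tokenizer with any tagger into one tweet -> [(token, tag)] function."""
    def run(tweet: str) -> List[Tuple[str, str]]:
        return tag(tokenize(tweet))
    return run


def nltk_gate_pipeline(model_path: str, jar_path: str) -> Callable[[str], List[Tuple[str, str]]]:
    """Build the pipeline from NLTK parts.

    Both paths are assumptions: point them at your own copies of the
    GATE Twitter model and the Stanford POS tagger jar.
    """
    # Imported lazily so this module still loads without NLTK installed.
    from nltk.tokenize import TweetTokenizer
    from nltk.tag.stanford import StanfordPOSTagger

    tokenizer = TweetTokenizer()
    # The constructor looks up the model and jar; tagging itself shells out to Java.
    tagger = StanfordPOSTagger(model_path, path_to_jar=jar_path)
    return make_pipeline(tokenizer.tokenize, tagger.tag)
```

Since make_pipeline only needs two callables, swapping NLTK's tokenizer for the ARK one [0] should just mean passing that tokenizer's function as the first argument instead.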

[0] https://github.com/myleott/ark-twokenize-py

[1] https://gate.ac.uk/wiki/twitter-postagger.html