by butterm
3534 days ago
Yup, I know about that, but displaCy is just so much more beautiful. Also, while NLTK's basic Twitter tokenizer is okay, I find that ARK's tokenizer [0] is much better. Similarly, for POS tagging of tweets, I am using the GATE POS tagger [1]. They provide a Stanford-format model, so I can hook it up with NLTK using the StanfordPOSTagger class. In fact, this is the kind of integration that I am missing in spaCy.

[0] https://github.com/myleott/ark-twokenize-py
[1] https://gate.ac.uk/wiki/twitter-postagger.html
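
For anyone wanting to try the NLTK hookup, here is a rough sketch of how it might look. The helper name `build_twitter_tagger` and both file paths are placeholders I made up; you would point them at wherever you unpacked the GATE model and the Stanford tagger jar. It relies on NLTK's `StanfordPOSTagger` wrapper, which needs Java installed and is marked deprecated in recent NLTK releases, so treat this as an illustration rather than a recommended setup.

```python
from pathlib import Path


def build_twitter_tagger(model_path, jar_path):
    """Return an NLTK StanfordPOSTagger backed by a Stanford-format model.

    Both arguments are hypothetical local paths: the GATE Twitter model
    file and the Stanford POS tagger jar.
    """
    model, jar = Path(model_path), Path(jar_path)
    # Fail early with a readable message if either file is missing.
    for path in (model, jar):
        if not path.is_file():
            raise FileNotFoundError(f"missing tagger file: {path}")

    # Imported here so the path check above works even without NLTK installed.
    from nltk.tag import StanfordPOSTagger

    return StanfordPOSTagger(str(model), path_to_jar=str(jar), encoding="utf8")


if __name__ == "__main__":
    # Placeholder paths; adjust to your own download locations.
    tagger = build_twitter_tagger(
        "models/gate-EN-twitter.model",
        "stanford-postagger/stanford-postagger.jar",
    )
    # Whitespace splitting is only a stand-in for a proper tweet tokenizer.
    tokens = "omg displaCy looks sooo good".split()
    print(tagger.tag(tokens))
```

The tagger takes a list of tokens, so any tweet tokenizer can be swapped in ahead of the `tag` call.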