by vkhuc 4256 days ago
The whole project is based on various libraries. In particular, the POS tagger itself uses the OWLQN optimizer from Stanford NLP (licensed under GPL).

However, it's possible to remove the GPL libraries from the POS tagger, as mentioned here: https://github.com/brendano/ark-tweet-nlp/blob/master/LICENS...


You may want to look at Factorie (https://github.com/factorie/factorie), which has a decent POS tagger and isn't crippled by its license. It also has dependency parsing, which works reasonably well.
I've been looking at Factorie for a while but haven't actually done anything heavy with it.

I planned to replace the optimizer in CMU's POS tagger with the one implemented in OpenNLP to make the tagger fully Apache-licensed. Unfortunately, I'm too busy right now. Currently, I'm running the tagger on AWS, so the GPL doesn't hurt me much.

BTW, besides the POS tagger, CMU's TweeboParser depends on Turbo Parser, which is also licensed under the GPL.