We've been using spaCy a lot for the past few months, mostly for non-production use cases. Still, I can say that it is the most robust NLP framework at the moment. V3 added support for transformers: that's a killer feature, as many models from https://huggingface.co/docs/transformers/index work great out of the box. At the same time, I found the NER models provided by spaCy to have low accuracy on real data (we deal with news articles: https://demo.newscatcherapi.com/). Also, while I see how much attention ML models get from the crowd, I think that many problems can be solved with a rule-based approach, and spaCy is just amazing for that. Btw, we recently wrote a blog post comparing spaCy to NLTK for a text normalization task: https://newscatcherapi.com/blog/spacy-vs-nltk-text-normaliza...
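To make the rule-based point concrete, here is a minimal sketch using spaCy v3's entity ruler on a blank English pipeline, so no trained model is needed. The patterns, the labels chosen for them, and the `rule_based_entities` helper are made up for illustration; they are not taken from our actual setup.

```python
import spacy

# Blank English pipeline: tokenizer only, no statistical model to download.
nlp = spacy.blank("en")

# The entity ruler assigns entity labels from hand-written patterns.
ruler = nlp.add_pipe("entity_ruler")
ruler.add_patterns([
    # Hypothetical patterns. LOWER compares the lowercased token text,
    # so a match does not depend on how the input is cased.
    {"label": "ORG", "pattern": [{"LOWER": "newscatcher"}]},
    {"label": "GPE", "pattern": [{"LOWER": "new"}, {"LOWER": "york"}]},
])


def rule_based_entities(text):
    """Return (text, label) pairs for entities found by the patterns above."""
    return [(ent.text, ent.label_) for ent in nlp(text).ents]


print(rule_based_entities("newscatcher opened an office in new york"))
```

Because the patterns match on `LOWER`, the same rules should fire on lowercased headlines and on properly cased text alike, which is one reason rules can be attractive for messy input.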
The conclusion I came up with:
"A few notes on my Spacy NER accuracy with "real world" data
1. Low accuracy with sentences without proper casing
2. Low accuracy overall, even with a large model
3. You'd need to fine-tune your model if you want to use it in production
4. Overall, there's no open-source high-accuracy NER model that you can use out of the box"
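If you want to check claims like these against your own data, one common way is entity-level precision, recall and F1 over a small hand-labelled sample. The sketch below is plain Python and is not the procedure used in the blog post; `entity_prf` is a hypothetical helper and the annotations are invented for illustration.

```python
def entity_prf(gold, predicted):
    """Entity-level precision, recall and F1.

    Each entity is a (start_char, end_char, label) tuple. A prediction only
    counts as correct when both the span and the label match exactly.
    """
    gold_set, pred_set = set(gold), set(predicted)
    true_pos = len(gold_set & pred_set)
    precision = true_pos / len(pred_set) if pred_set else 0.0
    recall = true_pos / len(gold_set) if gold_set else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Made-up annotations for a single sentence, purely illustrative.
gold = [(0, 5, "ORG"), (20, 28, "GPE")]
predicted = [(0, 5, "ORG"), (20, 28, "PERSON"), (30, 34, "DATE")]
print(entity_prf(gold, predicted))
```

With spaCy, the predicted list could be built from `doc.ents` using `ent.start_char`, `ent.end_char` and `ent.label_`. Running the same sample once as-is and once lowercased would give a rough picture of how much casing matters.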