by desmap
2060 days ago
What would I miss if I went all transformers without spaCy? I don't get the idea of a wrapper API through spaCy. I'd like to stay as close as possible to the core transformers API, without any intermediate layers. Nothing against spaCy, but looking at Hugging Face's side and all the pre-trained models, it feels like nobody talks about or uses spaCy if they already use transformers.
spaCy's Doc object is pretty helpful for working with the outputs. For instance, you can iterate over the sentences, then over the entities within each sentence, and look at the tokens within them or get the dependency children of the words in the entity. The Doc object is backed by Cython data structures, so it's more memory-efficient and faster than the Python equivalents you'd likely write yourself.
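A rough sketch of that kind of traversal, assuming spaCy v3 is installed. To keep it self-contained, the Doc is built by hand on a blank vocab instead of coming from a trained or transformer-backed pipeline; the example sentence, heads, dependency labels and entity tags are all made up for illustration, not model output.

```python
import spacy
from spacy.tokens import Doc

# Blank English pipeline: provides a Vocab without downloading a trained model.
nlp = spacy.blank("en")

# Hand-written annotations (illustrative assumptions, not model predictions).
words = ["Apple", "bought", "Beats", ".", "Tim", "Cook", "approved", "."]
# Absolute index of each token's syntactic head; a root points to itself.
heads = [1, 1, 1, 1, 5, 6, 6, 6]
deps = ["nsubj", "ROOT", "dobj", "punct", "compound", "nsubj", "ROOT", "punct"]
# Token-level IOB entity tags.
ents = ["B-ORG", "O", "B-ORG", "O", "B-PERSON", "I-PERSON", "O", "O"]

# Sentence boundaries are derived from the dependency heads.
doc = Doc(nlp.vocab, words=words, heads=heads, deps=deps, ents=ents)

summary = []
for sent_index, sent in enumerate(doc.sents):
    # Entities that fall inside this sentence.
    for ent in sent.ents:
        # Tokens inside the entity, and each token's dependency children.
        for token in ent:
            children = [child.text for child in token.children]
            summary.append((sent_index, ent.text, ent.label_, token.text, children))

for row in summary:
    print(row)
```

With a real pipeline you would get the same Doc from `nlp(text)`, and the loop would stay unchanged.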
I also think our pipeline stuff is a bit more mature than the one in transformers. The transformers pipeline class is relatively new, so I do think our Language object offers a better developer experience.
I think the new training config and improved train command will also be appealing to people, especially with the projects workflow.
The improved transformers support in v3 is very new; it has only just been released in beta form. I do hope people find it useful, but of course no library or solution is ideal for every use case, so I definitely encourage people to pick the mix of libraries that seems right to them.