by theblackcat1002 2178 days ago
As someone who has worked with all three NLP toolkits (huggingface, OpenNMT-py and fairseq), I always have trouble navigating the heavy abstraction of OpenNMT-py.

For example, in OpenNMT-py you need to write fields, a reader and raw datasets before you can even load data into its complex dataset class [1]. Each item is heavily abstracted through several layers of classes. I understand this improves code reuse, but it introduces a steep learning curve for newcomers.

The Huggingface approach, on the other hand, is slightly more "messy" [2], but it is easier to understand and to tweak yourself.
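
To make the contrast concrete, here is a toy sketch in plain Python. None of these names come from either library: `Field`, `Reader`, `RawDataset`, `Dataset` and `load_flat` are all made up, and the sketch only shows the shape of the two styles, not how OpenNMT-py or Huggingface actually implement them.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, List


# --- Layered style (hypothetical classes, NOT OpenNMT-py's real API) ---
@dataclass
class Field:
    """Describes how one column of text is tokenized."""
    tokenize: Callable[[str], List[str]]


class Reader:
    """Turns raw lines into cleaned strings."""
    def read(self, lines: Iterable[str]) -> Iterator[str]:
        for line in lines:
            yield line.strip()


class RawDataset:
    """Holds the cleaned but untokenized examples."""
    def __init__(self, reader: Reader, lines: Iterable[str]):
        self.examples = list(reader.read(lines))


class Dataset:
    """Applies a field to every raw example."""
    def __init__(self, raw: RawDataset, field: Field):
        self.examples = [field.tokenize(ex) for ex in raw.examples]


# --- Flat style (hypothetical helper, NOT Huggingface's real API) ---
def load_flat(lines: Iterable[str]) -> List[List[str]]:
    """Does the same work in one place: strip, then split on whitespace."""
    return [line.strip().split() for line in lines]


lines = ["hello world\n", "a b c\n"]
layered = Dataset(RawDataset(Reader(), lines), Field(tokenize=str.split)).examples
flat = load_flat(lines)
```

Both paths produce the same tokens. The layered one lets you swap each piece independently, but a newcomer has to read four classes to see what happens to a single line of text, while the flat one can be read and edited in one place.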

[1] https://github.com/OpenNMT/OpenNMT-py/blob/master/onmt/bin/p...

[2] https://github.com/huggingface/transformers/tree/master/src/...