by doubtfuluser 1590 days ago
How does it actually compare to fasttext [1] in performance? Building an interface to that in Go shouldn’t be too complicated. The claim that all language identification (LID) relies on n-grams is bold; there has been a switch to purely neural-network-based approaches.

[1] https://fasttext.cc/docs/en/language-identification.html

2 comments

You can try it with the dataset that I used to evaluate fasttext against different language detectors. It’s linked in this blog post: https://alexott.blogspot.com/2017/10/evaluating-fasttexts-mo...

I’ll try to find time to do it myself, but most probably not before tomorrow.

I've compared the Python implementation of Lingua with fasttext. Lingua performs clearly better. Look here: https://github.com/pemistahl/lingua-py#4-how-good-is-it