| HN Mirror



	by akie 1590 days ago
	We're using libraries like this to try to guess the language of a book based on its title alone (in case no other information is readily available), and trigram-based algorithms get it wrong often enough to be noticeable. I will look into replacing our current library with this one; it seems better suited to the task at hand.

1 comment

Yeah, language detection on short texts is quite complex. In my experience, n-grams don't work well for them.
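
To illustrate why short inputs are hard for this approach, here is a minimal sketch of a character-trigram detector. It is not the implementation of any particular library: `trigrams`, `overlap`, `guess`, and `PROFILES` are hypothetical names, and the two training sentences are toy data standing in for the large corpora a real detector would use.

```python
from collections import Counter


def trigrams(text):
    """Count character trigrams, padding the text with one space on each side."""
    padded = f" {text.lower()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def overlap(sample, profile):
    """Fraction of the sample's trigrams that also occur in the profile."""
    total = sum(sample.values())
    if total == 0:
        return 0.0
    hits = sum(n for gram, n in sample.items() if gram in profile)
    return hits / total


# Toy profiles (assumption): real detectors train on far more text.
PROFILES = {
    "en": trigrams("the quick brown fox jumps over the lazy dog and the cat"),
    "de": trigrams("der schnelle braune fuchs springt über den faulen hund und die katze"),
}


def guess(text):
    """Return the best-scoring language code and the per-language scores."""
    sample = trigrams(text)
    scores = {lang: overlap(sample, profile) for lang, profile in PROFILES.items()}
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    print(guess("the dog"))
    print(guess("Hand"))
```

A one-word title such as "Hand" produces only four trigrams, so a single trigram shared with the wrong profile shifts the score by a quarter. That sensitivity is one plausible reason trigram-based guesses on titles are unreliable.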