by microtonal 4594 days ago
Or store the lexicon in a deterministic acyclic finite state automaton. E.g. (shameless plug):

https://github.com/danieldk/dictomaton
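
To give a feel for why this saves space, here is a minimal Python sketch of the idea. It is not dictomaton's API or its construction algorithm: it simply builds a trie and then merges identical subtrees, and the names `build_dafsa` and `contains` are made up for illustration.

```python
END = ""  # hypothetical end-of-word marker; never collides with a real character


def build_dafsa(words):
    """Build a minimal acyclic automaton by merging identical trie subtrees."""
    # Step 1: ordinary trie as nested dicts.
    trie = {}
    for word in words:
        node = trie
        for ch in word:
            node = node.setdefault(ch, {})
        node[END] = {}

    registry = {}  # signature -> state id
    states = []    # state id -> (is_final, {char: target state id})

    # Step 2: visit the trie bottom-up; subtrees with the same signature
    # (finality plus outgoing edges) become one shared state.
    def minimize(node):
        final = END in node
        edges = {ch: minimize(child)
                 for ch, child in sorted(node.items()) if ch != END}
        signature = (final, tuple(edges.items()))
        if signature not in registry:
            registry[signature] = len(states)
            states.append((final, edges))
        return registry[signature]

    start = minimize(trie)
    return start, states


def contains(dafsa, word):
    """Follow the word's characters; accept only if we end in a final state."""
    state, states = dafsa
    for ch in word:
        _, edges = states[state]
        if ch not in edges:
            return False
        state = edges[ch]
    return states[state][0]
```

For the words `tap`, `taps`, `top`, `tops` the trie has 8 nodes, while the automaton above needs 5 states, because the `p`/`ps` suffixes are shared between the `a` and `o` branches. Real implementations usually build the automaton incrementally from sorted input instead of materialising the full trie first.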

Though, having implemented a language guesser myself, it's only an issue with very short texts (a few words). On longer texts, models based on character n-grams achieve very high accuracy.
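
For anyone who hasn't seen one, a character n-gram guesser can be sketched in a few lines of Python. This is not my implementation, just a toy under loud assumptions: trigram counts with add-one smoothing, scored by log-likelihood, and the function names and the two training sentences are invented for illustration (a real model would be trained on a proper corpus per language).

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Return the character n-grams of text, padded with spaces at both ends."""
    padded = f" {text.lower()} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def train(samples, n=3):
    """Map each language to its n-gram counts. `samples` is {lang: text}."""
    return {lang: Counter(char_ngrams(text, n)) for lang, text in samples.items()}


def guess(models, text, n=3):
    """Pick the language whose smoothed n-gram model gives the highest log-likelihood."""
    vocab = len(set().union(*models.values()))  # distinct n-grams over all languages

    def score(counts):
        total = sum(counts.values())
        # Add-one smoothing so unseen n-grams do not zero out the score.
        return sum(math.log((counts[g] + 1) / (total + vocab))
                   for g in char_ngrams(text, n))

    return max(models, key=lambda lang: score(models[lang]))


# Made-up toy training data, far too small for real use.
models = train({
    "en": "the quick brown fox jumps over the lazy dog and the cat sleeps in the house",
    "nl": "de snelle bruine vos springt over de luie hond en de kat slaapt in het huis",
})
```

With these toy models, `guess(models, "the dog is in the house")` picks `"en"` and `guess(models, "de hond slaapt in het huis")` picks `"nl"`. With only one or two words there are very few n-grams to score, which is where a lexicon lookup helps.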