Hacker News
by yorwba 1592 days ago
With an n-gram-based model like this one, you can just feed it short substrings, since it doesn't take the larger context into account anyway. There'll be some problems at the boundaries between languages, because e.g. "as" is a word in both languages.
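
To make that concrete, here's a rough sketch of the substring idea. It is not the model from the post: `train`, `classify` and `label_words` are made-up names, the character-trigram scoring is a toy stand-in, and the English/Spanish sample sentences are just placeholders. Each short window of words gets classified on its own, and every word takes the majority label of the windows covering it.

```python
from collections import Counter


def char_ngrams(text, n=3):
    """Character n-grams of text, padded with spaces so word edges count."""
    padded = f" {text.lower()} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def train(samples, n=3):
    """Build one n-gram count profile per language (toy model, hypothetical)."""
    return {lang: Counter(char_ngrams(text, n)) for lang, text in samples.items()}


def classify(model, text, n=3):
    """Pick the language whose profile gives the text's n-grams the most mass."""
    grams = char_ngrams(text, n)

    def score(lang):
        profile = model[lang]
        return sum(profile[g] for g in grams) / sum(profile.values())

    return max(model, key=score)


def label_words(model, text, window=3):
    """Classify each short window of words, then let each word take a vote."""
    words = text.split()
    votes = [Counter() for _ in words]
    for start in range(max(1, len(words) - window + 1)):
        span = words[start:start + window]
        lang = classify(model, " ".join(span))
        for i in range(start, start + len(span)):
            votes[i][lang] += 1
    return [(word, vote.most_common(1)[0][0]) for word, vote in zip(words, votes)]


# Placeholder training text; a real model would use far more data.
model = train({
    "en": "the quick brown fox jumps over the lazy dog and the cat",
    "es": "el rápido zorro marrón salta sobre el perro perezoso y el gato",
})
labels = label_words(model, "the lazy dog salta sobre el gato")
```

Words near the switch point are covered by windows containing both languages, so that is where the votes get unreliable, and a word that exists in both languages gives the model nothing to go on by itself.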