by itronitron
2905 days ago
The Lucene API has a lot of language-specific tokenizers and analyzers that help normalize what counts as a term in the index, regardless of language. You can then apply various statistical NLP methods, which tend to be more language-agnostic.
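
To make the two stages concrete, here is a rough sketch of the idea in plain Python. It does not use Lucene at all: `normalize`, `STOP_WORDS`, and `tf_idf` are hypothetical names made up for illustration, and the normalizer (case folding, accent stripping, a tiny stop-word list) is only a crude stand-in for what a real language-specific analyzer does. TF-IDF is just one example of a statistical method that only needs term counts.

```python
import math
import unicodedata
from collections import Counter

# Hypothetical, deliberately tiny stop-word lists (assumption for this sketch);
# real analyzers ship much larger, curated lists per language.
STOP_WORDS = {
    "en": {"the", "a", "is", "of"},
    "de": {"der", "die", "das", "ist"},
}


def normalize(text, lang):
    """Stand-in for a language-specific analyzer.

    Case-folds, strips accents, splits on non-alphanumeric characters,
    and drops the stop words registered for `lang`.
    """
    folded = unicodedata.normalize("NFKD", text.casefold())
    folded = "".join(ch for ch in folded if not unicodedata.combining(ch))
    tokens = "".join(ch if ch.isalnum() else " " for ch in folded).split()
    stop = STOP_WORDS.get(lang, set())
    return [t for t in tokens if t not in stop]


def tf_idf(docs):
    """Language-agnostic step: score terms using only counts over tokens.

    `docs` is a list of token lists; returns one {term: score} dict per doc.
    """
    n = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        scores.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return scores


# Each document is normalized with the rules for its own language...
corpus = [
    normalize("The cat is tired", "en"),
    normalize("Die Katze ist müde", "de"),
    normalize("The cat sleeps", "en"),
]
# ...and the scoring step never needs to know which language it came from.
scores = tf_idf(corpus)
```

The point of the split is that everything language-dependent lives in the first function, so the statistics downstream can treat every document the same way.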