Hacker News
by rpedela 2948 days ago
Many NLP algorithms, especially ML ones, are actually language-agnostic because their input is just tokens or characters. The language-dependent part is the preprocessing: tokenization, normalization, and stemming. That part can be difficult, and if it's not done well it can screw up the downstream algorithm. As with everything, there are exceptions.
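
To make the split concrete, here is a rough Python sketch (not from any particular library; the function names `normalize`, `tokenize_whitespace`, `tokenize_chars`, and `term_frequencies` are made up for illustration). The counting step never looks at the language, while the tokenizers do, and picking the wrong one quietly changes what the downstream step sees. Stemming is left out to keep it short.

```python
from collections import Counter
import unicodedata


def normalize(text):
    # Normalization: Unicode NFKC plus case folding. Even this is partly
    # language dependent in practice; it is only a simple default here.
    return unicodedata.normalize("NFKC", text).casefold()


def tokenize_whitespace(text):
    # Language-dependent step: assumes words are separated by whitespace,
    # which holds for English but not for languages like Japanese.
    return normalize(text).split()


def tokenize_chars(text):
    # Crude alternative for text without word separators:
    # treat every non-space character as a token.
    return [ch for ch in normalize(text) if not ch.isspace()]


def term_frequencies(tokens):
    # Language-agnostic step: it only ever sees a list of tokens.
    return Counter(tokens)


english = term_frequencies(tokenize_whitespace("The cat saw the Cat"))
japanese_bad = term_frequencies(tokenize_whitespace("東京都に住む"))
japanese_chars = term_frequencies(tokenize_chars("東京都に住む"))
```

With the whitespace tokenizer the Japanese sentence collapses into a single token, so the counter has nothing useful to work with, even though `term_frequencies` itself is unchanged. A real pipeline would likely swap in a proper word segmenter for that language rather than the character fallback shown here.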