by DFHippie 2919 days ago
Or Welsh. You want a canonical form for "wnaethpwyd"? Try "gwneud"! Some regexes and an exceptions list aren't going to cut it.

The more a language needs a lemmatizer for NLP, the harder it is to write one.
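
To make the point concrete, here is a rough sketch of the "some regexes and an exceptions list" approach and where it falls over. Everything in it is a made-up toy: the function name `naive_lemmatize`, the single suffix rule, and the one-entry exceptions table are assumptions for illustration, not anyone's real Welsh lemmatizer. It assumes "canwyd" -> "canu" as an example of a regular pattern.

```python
import re

# Hypothetical toy rule: strip the impersonal past ending "-wyd" and add "-u".
# Works for a regular verb such as "canwyd" -> "canu" (assumed example).
SUFFIX_RULES = [(re.compile(r"wyd$"), "u")]

# Hypothetical exceptions list holding the irregular, unmutated form only.
EXCEPTIONS = {"gwnaethpwyd": "gwneud"}


def naive_lemmatize(word: str) -> str:
    """Exceptions lookup first, then the first matching suffix rule."""
    word = word.lower()
    if word in EXCEPTIONS:
        return EXCEPTIONS[word]
    for pattern, replacement in SUFFIX_RULES:
        if pattern.search(word):
            return pattern.sub(replacement, word)
    return word


for form in ("canwyd", "gwnaethpwyd", "wnaethpwyd"):
    print(form, "->", naive_lemmatize(form))
```

The third form is the one from the comment: the initial "g" is gone, so the exceptions lookup misses, the suffix rule fires instead, and the result is not "gwneud". You can add the g-less form to the table too, but then every irregular form needs an entry per initial-consonant variant, on top of the irregular stem itself.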

1 comment

Could someone explain what's hard in particular about canonical forms in Welsh?