by internet_points
394 days ago
Finnish would like a word. Take a random noun like kauppa "shop". It has at least 6000 forms: https://flammie.github.io/omorfi/genkau3.html and that's excluding compounds (written as one word in Finnish) like "bookshop" or "shop-manager", etc. Then you have loan words, slang, and derivations into other word classes; none of this can be represented compactly in a full-form word list. Now consider the many other languages of that family ( https://en.wikipedia.org/wiki/Uralic_languages ): they also have this extreme potential for inflection, but next to no online resources to train language models on, or even to scrape decent wordlists from.
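To make the blow-up concrete, here is a rough Python sketch that only counts combinations of suffix slots. Everything in it is an illustrative assumption: the slot inventories are simplified, the names (`analyses`, `count_forms`) are made up, and it emits tag strings rather than real surface forms, so it ignores consonant gradation, vowel harmony, stacked clitics, and cases that only occur in one number. It is not how omorfi works and is not meant to reproduce the 6000 figure.

```python
from itertools import product

# Hypothetical, simplified slot inventories (labels only, not real surface forms).
NUMBERS = ["sg", "pl"]
CASES = [
    "nominative", "genitive", "partitive", "inessive", "elative",
    "illative", "adessive", "ablative", "allative", "essive",
    "translative", "abessive", "instructive", "comitative",
]
POSSESSIVES = ["none", "1sg", "2sg", "3", "1pl", "2pl"]
CLITICS = ["none", "kin", "kaan", "ko", "han", "pa"]


def analyses(lemma):
    """Yield one tag string per slot combination for a noun lemma."""
    for number, case, poss, clitic in product(NUMBERS, CASES, POSSESSIVES, CLITICS):
        yield f"{lemma}+{number}+{case}+{poss}+{clitic}"


def count_forms(n_lemmas, compound_parts=1):
    """Toy size of a full-form list.

    Assumes every compound is a free choice of `compound_parts` lemmas
    and that only the last part inflects.
    """
    per_lemma = len(NUMBERS) * len(CASES) * len(POSSESSIVES) * len(CLITICS)
    return n_lemmas ** compound_parts * per_lemma


if __name__ == "__main__":
    print(len(list(analyses("kauppa"))))        # slot combinations for one noun
    print(count_forms(1000))                    # 1000 simple nouns
    print(count_forms(1000, compound_parts=2))  # plus two-part compounds
```

Even with these cut-down slots, listing every form scales multiplicatively with each slot and with compounding, which is why a rule-based or finite-state generator is a more natural fit than a word list.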