by arp242 1775 days ago
Note that Icelandic is currently not well supported either ("In progress" with 384/5000 sentences and 86% Localized). Actually, Guaraní is better supported at the moment, and quite a number of other common smaller-ish languages aren't well supported yet either, such as Hebrew, Danish, and even Korean (which is not small or even small-ish at all). Some other smaller languages are well supported, such as Breton or Irish. Overall, it's a bit inconsistent. I suppose that this is because in the end, these things depend on the number of people contributing; there's a reason Esperanto is near the top, as it has a very active community of enthusiasts who love to promote the language.
1 comment

It takes about a week to get the interface translated and to start collection, for any language with at least 5000 sentences in the public domain. I helped bootstrap Guarani and Breton and a few other languages spoken by friends of mine, but in the end, it just takes one or two people. I think in general there is a big difference in engagement depending on whether STT/ASR already exists for the language (e.g. Hebrew, Danish and Korean) or doesn't exist at all.