Hacker News
by DoingIsLearning 983 days ago
If it's training on the available online corpus, then it will go quickly mostly for English and Mandarin.

Most countries' classic texts and books are still undigitized, sitting in libraries and public archives.

Also, the book publishing market and online publishing are proportional to total population, so a smaller country means less content.