by wilde 1445 days ago
The point is that there are lots of humans who speak these languages and use tech. They just don’t use Wikipedia, so getting a good translation corpus going was harder.

And it's cumulative across all those languages (see above) and cheap/amortized (if you can do a good multilingual NMT for 50 languages, how hard can 50+1 languages be?), and many of those languages are likely to grow both in sheer population and in GDP. (Think about Southeast Asian or African countries like Indonesia or Nigeria.) The question isn't why FB & Google are investing so much in powerful multilingual models which handle hundreds of languages, but why other entities aren't as well.
What other entities would really have access to the text resources that FB & Google have? Outside of a few other large companies, I can't imagine many.