by umeshunni 398 days ago
I've often wondered if that's a good thing or a bad thing.

When I read Wikipedia in English, it often feels like I'm missing specifics, details, or even points of view on particular (international) topics. I was reading about a town in Estonia recently while trying to track down some ancestry. The English page had limited information, but when I switched to the Estonian page and used Google Translate, I found a ton of detail. I see the same when reading about smaller towns in India or non-English literature.

Would some sort of auto-translation and content augmentation (with editorial support) be useful here?

2 comments

If you speak multiple languages (or are willing to read machine translation) you can often get a much richer understanding of a topic by reading Wikipedia in multiple languages. Wikipedia strives to be unbiased, but obviously that's a goal, not a reality, and different language editions are biased in different directions. Even articles of comparable length often emphasize very different aspects and deem different facts relevant.

And sometimes there are facts that are just less relevant in certain languages. The English article on the model railway scale HO spends the majority of its introduction on a lengthy explanation that HO stands for "Half O", and that the O scale is actually part of the set of scales 0, 1, 2 and 3, but English speakers still use the letter O. That is important to note in an English article, but completely irrelevant in the majority of languages, which don't share this very British quirk and call it H0 instead.

Cultural diversity is a big strength of Wikipedia. Turning everything into one smooshed-together thing would be a travesty. Making the different language versions more accessible to readers would be helpful, but it would also dilute the diversity: it would certainly bring more editors of one language into other language versions of the same article, leading to more homogenized viewpoints and a view that's even more dominated by the most active Wikipedians (presumably Americans and Germans).

> Would some sort of auto-translation and content augmentation (with editorial support) be useful here?

The Wikipedia folks are working on this, but they plan to auto-generate natural language text from some sort of language-independent semantic representation rather than translate existing articles. This avoids the unreliability of typical machine translation and makes it comparatively easier to have meaningful support for severely under-resourced languages, where providing good encyclopedic text is arguably most important. Details are very much TBD but see https://meta.wikimedia.org/wiki/Abstract_Wikipedia for a broader description. If you prefer listening to a podcast episode, see e.g. https://www.youtube.com/watch?v=a57QK4rARpw
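
To make the idea concrete, here is a toy sketch of what "generate text from a language-independent representation" could look like. This is not Abstract Wikipedia's actual design or API; the `Fact` class, the `LEXICON` and `TEMPLATES` tables, and the `render` function are all made up for illustration, and real natural language generation has to deal with grammar (case, gender, agreement) that simple templates ignore.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    """A language-neutral statement: identifiers only, no natural-language words."""
    subject: str
    relation: str
    obj: str


# Hypothetical per-language vocabulary: maps neutral identifiers to words.
LEXICON = {
    "en": {"tartu": "Tartu", "estonia": "Estonia"},
    "de": {"tartu": "Tartu", "estonia": "Estland"},
}

# Hypothetical per-language sentence templates, one per relation.
TEMPLATES = {
    "en": {"located_in": "{s} is in {o}."},
    "de": {"located_in": "{s} liegt in {o}."},
}


def render(fact: Fact, lang: str) -> str:
    """Turn one abstract fact into a sentence in the requested language."""
    words = LEXICON[lang]
    template = TEMPLATES[lang][fact.relation]
    return template.format(s=words[fact.subject], o=words[fact.obj])


fact = Fact("tartu", "located_in", "estonia")
print(render(fact, "en"))  # Tartu is in Estonia.
print(render(fact, "de"))  # Tartu liegt in Estland.
```

The point of the sketch is that the fact is stored once, and supporting a new language means adding a lexicon and templates rather than translating every article.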