by paulgerhardt 2597 days ago
In my experience zh.wikipedia.org is largely authored by Taiwanese residents. The quality is high but local narratives are often written with some bias not entirely unlike what one would expect from a displaced population resulting from a fairly recent civil war.

The English version of Wikipedia is not free of bias either. Wikipedia is not a source of truth for everything out there.
When I see articles about the same thing in different languages stating different and incompatible numbers, I always think this is an obvious and easily solvable problem: separate the data from the language and reference it in the text. That way each language uses the same data. This shouldn't be done just with numbers, but also with dates and other easily convertible information.

Yes, this will create conflicts as to which information is correct, but Wikipedia has had this problem since forever, and deals with it.
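
To make the proposal concrete, here is a minimal sketch of "separate the data from the language" in Python. Everything in it is hypothetical and for illustration only: the `FACTS` store, the `ARTICLES` templates, the town, and its numbers are made up, and this is not how Wikipedia or Wikidata actually implement it.

```python
from string import Template

# Hypothetical language-independent fact store (made-up values).
FACTS = {
    "town_a": {"population": 10000, "founded": "1850-03-01"},
}

# Hypothetical per-language article text that references facts by name
# instead of hard-coding the values.
ARTICLES = {
    "en": Template("Town A has $population inhabitants and was founded on $founded."),
    "de": Template("Town A hat $population Einwohner und wurde am $founded gegründet."),
}


def format_number(value: int, lang: str) -> str:
    """Format an integer with the thousands separator of the given language."""
    grouped = f"{value:,}"  # e.g. 10,000
    return grouped.replace(",", ".") if lang == "de" else grouped


def render(subject: str, lang: str) -> str:
    """Fill one language's text with values from the shared fact store."""
    facts = FACTS[subject]
    return ARTICLES[lang].substitute(
        population=format_number(facts["population"], lang),
        founded=facts["founded"],  # ISO date kept as-is for simplicity
    )


print(render("town_a", "en"))
print(render("town_a", "de"))
```

Updating `FACTS["town_a"]["population"]` once changes every language version at the same time, which is the whole point of the idea, and also why a wrong value would propagate everywhere.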

This is currently being done on Wikidata for information that is more easily represented in a database.

https://www.wikidata.org

https://en.wikipedia.org/wiki/Wikidata

Some infoboxes on Wikipedia have values that are automatically synced to the related Wikidata entry.

https://en.wikipedia.org/wiki/Infobox

The problem is that most data can never be verified. A source may never be fully accurate. A source could be a bunch of BS in the worst case. Even government data and media-based data frequently contradict each other.

For recent events we have already seen large press groups spreading misinformation. So how do you know when they are reporting facts and when they are producing biased BS? At the end of the day somebody makes a call and we know nothing of their affiliations.

Wikipedia also has a notorious problem with "sources" that are written by the same person that's editing the article.
That's a horrible idea.

1. The surrounding text depends on the number it contains. By blindly replacing numbers in every language version, you get garbage like "Town A is the most populated place in the region at 10,000 inhabitants, followed by Town B at 11,000."

2. If you look at multiple language versions and they disagree, you know one of them must be incorrect and you should watch out for bias and outdated information. Forcing them to all have the same data takes away that feature.

3. Bias is rarely a problem with objectively checkable data. When the PRC publishes an encyclopedia, the issue with it is not that they'd get the date of Tiananmen Square wrong.

4. Requiring people to use and read some sort of placeholders instead of ordinary text greatly increases the barrier to entry.
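
Point 1 can be reproduced with a small Python sketch. The sentence template, the `old` and `new` population figures, and the helper names are all hypothetical; the only thing taken from the example above is the resulting 10,000 / 11,000 sentence.

```python
# Hypothetical article sentence whose wording assumes Town A is the largest.
SENTENCE = (
    "Town A is the most populated place in the region at {a:,} inhabitants, "
    "followed by Town B at {b:,}."
)


def render(populations: dict) -> str:
    """Blindly substitute shared numbers into the fixed sentence."""
    return SENTENCE.format(a=populations["town_a"], b=populations["town_b"])


def ranking_claim_holds(populations: dict) -> bool:
    """Check whether the claim baked into the prose is still true."""
    return populations["town_a"] >= populations["town_b"]


old = {"town_a": 12000, "town_b": 11000}  # made-up figures the text was written for
new = {"town_a": 10000, "town_b": 11000}  # made-up update to the shared data

for populations in (old, new):
    print(render(populations), "| claim holds:", ranking_claim_holds(populations))
```

With the `new` figures the sentence still renders without any error, but the ranking it asserts is false. Nothing in the substitution step can notice that, because the claim lives in the prose and not in the data.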

I only use Wikipedia as a reference guideline for specific subjects I am not knowledgeable in. It does a fairly good job at providing that kind of information.
Very few Taiwanese today associate themselves with the mainland. Less than 10% of the population were refugees two decades ago, and it is even less today, with second-generation mainlanders being more or less assimilated.
It's not just a zh / en problem. It's probably a problem in every language.

For fun I compare some Russian vs. English articles on controversial topics (e.g. Stalin, Nicholas II, etc.). It's really interesting to see how they differ. In most cases each version matches the traditional point of view of its native-speaker community.