by fps_doug 1488 days ago
"I don't believe it because I'm missing context" is a weak argument to make. The tweets read like the author isn't a native English speaker. The traditional vs. simplified characters accusation should be relatively easy to confirm if you want to put in at least a little effort. Looking up characters as someone who isn't familiar with them at all might be cumbersome but is absolutely possible.

But then again you wonder why such blatant mistakes would be made in the first place, if this was done by someone at least halfway professional.


>"I don't believe it because I'm missing context"

I genuinely don't understand where the mistake is, even in your paraphrased version, which is intended as a caricature. Yes, I really do think that missing context is a good reason to refrain from believing something, and you should too. I find it bizarre that that is disputed.

I actually went into a fair amount of detail about what specifically was contextually inadequate here, all of which you didn't engage with.

It's hard to understand what the tweet is suggesting unless you read the entire thread, one related twitter thread, and chase down an implied history of allegedly questionable BBC reportage the UN is complicit in, and join the author in making assumptions about what it all means. I stated all of this already.

And in addition to all the previous stuff I said, it's not obvious that a difference between traditional and simplified characters proves what you're asked to believe it proves, that it must have come from Taiwan. I'm assuming there's another thread somewhere that goes into detail about how Taiwan uses different characters in other documents, which is the basis for believing different characters here prove that it's a forgery?

It's not about the effort involved in comparing the characters, it's about the underlying logic for the argument, which is assumed to have been proven elsewhere but not referenced.

> It's hard to understand what the tweet is suggesting unless you read the entire thread

You linked to one tweet, which directly started talking about the traditional vs simplified characters. I don't see how there is much context missing.

Again "I don't believe it because I'm missing context" is a perfectly legitimate way to engage with something that's missing context. You appear to have abandoned that point to instead emphasize that there is indeed sufficient context (so I guess having enough context does matter to you, after all.)

I don't know if you're confusing me for somebody else, but I didn't link to any tweets. And as I've now said twice, which has been ignored twice, the thread alludes to numerous unmade arguments about the reliability of the BBC, the UN, Le Monde, and a reporter as background to motivate the inference that their reporting is unreliable.

And it makes an assumption about what is proved or not proved by traditional vs simplified characters (different = Taiwan), and that underlying assumption isn't backed up with an argument, and there's no reason to agree with that assumption without further context. A reader is supposed to already agree that that's how it works or else go scrolling through twitter timelines and searches to find where that argument is made.

Perhaps when you ignore all of this in your reply, and remind me that it's "directly started talking about [sic]" traditional vs. simplified characters, I can repeat this all again and hope the fourth time is the charm?

> I don't know if you're confusing me for somebody else, but I didn't link to any tweets.

sigh.. Correction: the link to the tweet from the guy you replied to, which you seemed to refer to.

> And as I've now said twice, which has been ignored twice, the thread alludes to numerous unmade arguments about the reliability of the BBC, the UN, Le Monde, and a reporter as background to motivate the inference that their reporting is unreliable.

I clicked the link again. I don't see any of that. It starts with claims about 1) a cursor (which I ignored) and 2) the issue with the characters, which is further elaborated on in a couple follow-up tweets. Then your comment mentions all these news outlets and I don't understand how that connects to the claims about the characters, or makes them taken out of context. It looks like a pretty stand-alone claim/issue that should be something to quickly do research on if you care about the topic, nothing taken wildly out of context.

> I can repeat this all again and hope the fourth time is the charm?

Sure, if that makes you feel better.

It's very simple:

1) software renders not Unicode characters but font glyphs

2) which font glyphs are chosen depends on many factors like installed fonts, OS, language/region settings, and so on

3) people author (and read) characters by how they look on their systems, what codepoints are used is not on anyone's mind

A differently configured system can uncover incorrect codepoint choices or rendering differences across machines, exactly what happened with the author of that tweet (supposedly living in Europe and not having the same old Windows machine as ones used in CCP apparat).
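
To make the codepoint vs. glyph distinction concrete, here is a minimal Python sketch (the helper name `codepoints` is mine, not something from the tweet or the document). It lists the code points stored in a string, which is all a document actually contains; which shape gets drawn for each one is up to whatever font the viewer's system picks.

```python
import unicodedata

def codepoints(text):
    """Return the code points in text as 'U+XXXX' strings."""
    return [f"U+{ord(ch):04X}" for ch in text]

# 国 (simplified) and 國 (traditional) are two different code points,
# so a file contains one or the other no matter which font displays it.
print(codepoints("国"), codepoints("國"))

# 骨 is a single unified code point. Mainland Chinese and Japanese fonts
# typically draw it slightly differently, but the stored data is identical.
print(codepoints("骨"), unicodedata.name("骨"))
```

So "I see a different shape" can mean either a genuinely different code point (the first case) or the same code point drawn with another region's font (the second). Only dumping the code points tells you which.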

In fact, this happens all the time and is a routine headache for anyone building CJK sites viewed from different countries in the region (for example, I see some traditional Japanese characters, instead of their simplified Chinese versions, on http://cs.mfa.gov.cn/wgrlh/. Is there a hidden meaning? Is the site fake?). When it comes to MS Word and IME in old Windows versions, things are even wilder. I doubt the tweeter didn't know this, most likely it's a stall tactic.

CJK is a hot mess, but it is what it is.

That happens if you have no language hint, or the wrong one, e.g. posting in Simplified Chinese on a Taiwanese website. If this was written in something like MSWord by CCP officials, it should have the proper language hint, and so should render properly on any OS newer than XP.

I'm not sure what you mean by language hints.

Setting aside all other assumptions you make about the soundness of their setup overall, consistency of their input methods, newness of their inventory, etc., do you actually believe they would have the fonts with traditional glyphs in them installed and used at all? What for? Remember, this applies to the system as a whole. A character would be shown as simplified by the system even during input.

Again, I tried it and I got different results in different software (even on a Mac), with Pages in particular showing only simplified characters and straight layout (in contrast to Quick Look, which is what the tweeter must have used). Do you seriously think CCP officials have a fleet of Macs to check document appearance in case they are leaked and/or scan documents for "enemy" Unicode points? If not, how would they even know what code points are there, if all they ever see is simplified?

One needs to look at vocabulary, word choices and such. That is something that could actually point to fakery. Nothing like that was claimed yet, of course.

> I'm not sure what you mean by language hints.

Because of Han unification, you can tell the font renderer which language context you're in and how you want things to be rendered. MSWord shows you the language in the status bar at the bottom, which is not only used for spell checking. In HTML, you can add the lang attribute to a tag to tell the browser what language the contained characters belong to.
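
A rough sketch of the HTML case, in Python only because it is easy to run (the `tag_with_lang` helper is hypothetical, just for illustration). The same code point is wrapped with three different `lang` values; a browser is then free to pick a region-appropriate font for each span, assuming suitable fonts are installed.

```python
from html import escape

def tag_with_lang(text, lang):
    """Wrap text in a <span> carrying a BCP 47 language tag."""
    return f'<span lang="{escape(lang, quote=True)}">{escape(text)}</span>'

# Same code point (U+9AA8) each time; only the language hint changes.
for lang in ("zh-Hans", "zh-Hant", "ja"):
    print(tag_with_lang("骨", lang))
```

Without any such hint the renderer falls back on system defaults, which is where the cross-region surprises come from.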

> do you actually believe they would have the fonts with traditional glyphs in them installed and used at all? What for?

Because ever since Vista, these come pre-installed regardless of your locale.

> Do you seriously think CCP officials have a fleet of Macs to check document appearance in case they are leaked and/or scan documents for "enemy" Unicode points?

I don't believe anything in particular, just adding technical context. The documents could also have been leaked through Taiwan or Hong Kong and then mangled there resulting in this.

> In html, you can add the lang attribute to a tag to tell the browser what language the contained glyphs belong to.

Well, you can visit probably any official Chinese government department website right now and see traditional Japanese characters instead of their simplified equivalents, if your machine happens to be configured that way. (Or at least the first one I stumbled across was like that; I pasted the link somewhere in another comment. And I most certainly have Chinese fonts installed; in fact I see only simplified characters when I open the tweeted document in Pages.)

So they clearly do not make that effort even with documents actually crafted with foreign readers in mind. Presumably things can't be expected to be better if we are discussing secret documents intended for internal CCP consumption.

Right, and a premise of the tweet cited here is that different character sets mean you should just freely assume it's fabricated by Taiwan. It doesn't make that argument (at least not anywhere in the cited thread), it just presents an examination of characters with that as an underlying assumption.