by Jongseong 4194 days ago
The article has a really weak grasp of the language situation in the Koreas. Setting aside the conclusion of the article, here is my input on the language-related points (the first two), coming from a South Korean.

I have to say I find point 1 borderline offensive: the claim is that the English basically isn't bad enough to be authentic "Konglish". It can't have been written by North Koreans unless you see comprehension mistakes! Does the author know that, perhaps counterintuitively, English is the most widely taught foreign language in North Korea? Or is he familiar with the barrage of English-language propaganda put out by the North Korean regime?

I wouldn't describe it as "broken English" either. Stilted and unlikely to have been produced by a complete native speaker, yes (e.g. old-fashioned English subjunctive in "our request be met"), but not ungrammatical. I have no particular trouble believing that it is an earnest attempt by a non-native speaker to write correct English.

Point 2 is the weakest. I have no idea where the author got the notion of North Koreans speaking their own dialects and traditional Korean being forbidden. Korean, like any language, has regional dialects in both North and South Korea, but the language itself was standardized before the division of the peninsula, based on the Central dialect region around Seoul. This dialect region is split between the North and South, so that, for example, the speech in Kaesong, North Korea is similar to the speech in Seoul, while Pyongyang falls into a different dialect region. Nevertheless, because Standard Korean was established before the division, the standard speech in North Korea is also based on the Central dialect. The Standard Korean spoken by someone from the North is not as different from what you would hear from someone from the South as one might imagine, as South Koreans may verify by watching a North Korean news broadcast. There are of course differences in orthography and vocabulary, similar to what you would find between the UK and US in English (thus the "helicopter" example supplied by the author), but this has more to do with a natural divergence of the language after decades of forced separation than anything else.

The closest thing I can think of to the notion of traditional Korean being forbidden is that North Korea banned Chinese characters from official writing right away, while South Korea didn't go as far but still eliminated Chinese characters from texts used in education. Korean has its own alphabet, but Classical Chinese was the traditional literary language, and Sino-Korean vocabulary (words derived from Classical Chinese) was often written in Chinese characters in a "mixed-script" style reminiscent of Japanese. In both Koreas, the end result was that Korean came to be written purely in the Korean alphabet. In South Korea this was gradual, as the mixed-script style held on for a few decades, but by now most South Koreans have been educated writing only in the Korean alphabet. At any rate, Koreans wouldn't be using Chinese characters on computers anyway, North or South, so this is an irrelevant historical detail by now.

What does the author mean by saying that "the code was written on a PC with Korean locale & language"? That the actual coding was done in Korean? What kind of programming language used by hackers is in Korean? I am not familiar with the details of the Sony case so I would like to be enlightened on what the author actually means here.


Locale information includes things like encodings to allow a human language to be stored as data.

It is probably the case that the most common encoding is ASCII, with the most common modern encoding being UTF-8. If you're writing code you don't want traced to a particular language, use ASCII.

You would only need a separate encoding if you were going to write the code with special characters. In this case a Korean encoding would only be useful for comments and string literals, as most computer languages are ASCII-based. Since the messages from the malware are apparently in English, this seems superfluous and more like a sign of a false flag operation. In this context, setting a Korean locale is an unnecessary and ill-advised step, one you would normally have to go out of your way to take.
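To make the point about ASCII concrete, here is a minimal Python sketch. The two sample strings are hypothetical and are not taken from the malware; the codec names are the ones Python's standard library uses. It only illustrates that ASCII-only source produces the same bytes under ASCII, UTF-8 and EUC-KR, while a Korean string literal does not.

```python
# Hypothetical sample "source lines", not taken from the Sony malware.
ascii_only = 'print("Hello")'
with_korean = 'print("안녕")'


def encoded_forms(text, encodings):
    """Return a dict mapping each encoding name to the encoded bytes."""
    return {name: text.encode(name) for name in encodings}


# ASCII-only text: every ASCII-compatible encoding yields identical bytes,
# so the bytes say nothing about which encoding the author had configured.
ascii_forms = encoded_forms(ascii_only, ("ascii", "utf-8", "euc_kr"))
same_for_ascii = len(set(ascii_forms.values())) == 1

# Text with Hangul: the bytes now depend on the encoding that was used.
korean_forms = encoded_forms(with_korean, ("utf-8", "euc_kr"))
differs_for_korean = korean_forms["utf-8"] != korean_forms["euc_kr"]

print(same_for_ascii, differs_for_korean)
```

In other words, the encoding only leaves a trace in the bytes once non-ASCII characters appear, which is why it would matter for comments and string literals but not for the code itself.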

Wikipedia has more specific information regarding Korean language encodings: http://en.wikipedia.org/wiki/Korean_language_and_computers

Thanks, I should have made the connection to encoding immediately. I've found more info in the following Guardian article (http://www.theguardian.com/technology/2014/dec/02/north-kore...):

“In the file we had a line with broken characters. Those characters didn’t render right under any encoding, except EUC-CN [Chinese] and EUC-KR [Korean] … In this case, the readme.txt file could be read fine under either EUC-CN and EUC-KR, which means the file was most likely generated from a computer set in either Chinese or Korean – or the hacker deliberately converted the file (which seems unlikely),” Karpeles said.
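The kind of check described in that quote can be sketched roughly as follows. This is only an illustration under stated assumptions: the sample bytes are a hypothetical Korean string encoded as EUC-KR, not the actual readme.txt, and `decodable_under` is a helper name made up for this example. Python's `gb2312` codec is used for EUC-CN.

```python
def decodable_under(data, candidates):
    """Return the candidate encodings under which data decodes without error."""
    matches = []
    for name in candidates:
        try:
            data.decode(name)  # strict mode: raises on any invalid byte sequence
        except UnicodeDecodeError:
            continue
        matches.append(name)
    return matches


# Hypothetical stand-in for the "line with broken characters".
sample = "안녕".encode("euc_kr")

candidates = ["ascii", "utf-8", "euc_kr", "gb2312"]
matches = decodable_under(sample, candidates)
print(matches)
```

Note that decoding without error is a weaker test than being readable: these sample bytes pass under both `euc_kr` and `gb2312`, but only the `euc_kr` decoding gives back the original Korean text. A check like this narrows down the candidate encodings; it does not by itself identify the locale of the machine.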

I should add that EUC-KR is a South Korean legacy character encoding, but the corresponding North Korean encoding (EUC-KP?) is hardly ever supported, so in practice North Koreans would be likely to use EUC-KR.