by hashhash 4194 days ago
Locale information includes things like the character encoding, which determines how text in a human language is stored as bytes.

ASCII is probably the most common encoding overall, and UTF-8 the most common modern one. If you're writing code you don't want traced to a particular language, use ASCII.

You would only need a different encoding if the code contained non-ASCII characters. Even then, a Korean encoding would only matter for comments and string literals, since most programming languages are ASCII-based. Since the messages from the malware are apparently in English, a Korean encoding seems superfluous and looks more like a sign of a false flag operation. In this context, setting a Korean locale is an unnecessary and ill-advised step that you would normally have to go out of your way to get right.
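To illustrate why the encoding shouldn't matter for English-only text, here is a rough Python sketch. The function name and the sample strings are made up for illustration (they are not from the malware); it only relies on Python's built-in `ascii`, `utf-8` and `euc_kr` codecs.

```python
def same_bytes_in_all(text, encodings=("ascii", "utf-8", "euc_kr")):
    """Return True if `text` encodes to identical bytes in every listed encoding."""
    try:
        encoded = {text.encode(name) for name in encodings}
    except UnicodeEncodeError:
        # At least one encoding cannot represent the text at all.
        return False
    return len(encoded) == 1


# Hypothetical strings, chosen only for illustration.
print(same_bytes_in_all("Your files have been locked."))  # plain English text
print(same_bytes_in_all("한글"))                           # Korean text
```

Plain English text comes out byte-for-byte the same under all three, so the choice of a Korean encoding leaves no trace unless non-ASCII characters are actually present.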

Wikipedia has more specific information regarding Korean language encodings: http://en.wikipedia.org/wiki/Korean_language_and_computers

1 comment

Thanks, I should have made the connection to encoding immediately. I've found more info in the following Guardian article (http://www.theguardian.com/technology/2014/dec/02/north-kore...):

“In the file we had a line with broken characters. Those characters didn’t render right under any encoding, except EUC-CN [Chinese] and EUC-KR [Korean] … In this case, the readme.txt file could be read fine under either EUC-CN and EUC-KR, which means the file was most likely generated from a computer set in either Chinese or Korean – or the hacker deliberately converted the file (which seems unlikely),” Karpeles said.

I should add that EUC-KR is a South Korean legacy character encoding, but the corresponding North Korean encoding (EUC-KP?) is hardly ever supported, so in practice North Koreans would be likely to use EUC-KR.
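For anyone curious, the kind of check Karpeles describes can be roughly approximated in Python by trying to decode the same bytes under several candidate encodings. This is only a sketch under my own assumptions: the function name and sample bytes are invented, the candidate list is deliberately short, and Python's `gb2312` codec is used as the stand-in for EUC-CN. A successful decode only means the bytes are valid in that encoding, not that the result reads as sensible text, so a human still has to look at the output.

```python
def decodable_encodings(data, candidates=("ascii", "utf-8", "euc_kr", "gb2312")):
    """Return the candidate encodings under which `data` decodes without error."""
    ok = []
    for name in candidates:
        try:
            data.decode(name)  # strict decoding: raises on invalid byte sequences
        except UnicodeDecodeError:
            continue
        ok.append(name)
    return ok


# Hypothetical sample: two Hangul syllables stored as EUC-KR bytes.
sample = "한글".encode("euc_kr")
print(decodable_encodings(sample))  # -> ['euc_kr', 'gb2312']
```

The sample decodes under both EUC-KR and EUC-CN because the two encodings use overlapping two-byte ranges, which fits the quote's point that the file could be read under either.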