I was wondering whether, when it adds 0x8000 (or whatever it's actually doing), it ends up hitting 0x1.... codepoints and doing weird things with them because of the encoding. Here's a trace of 한글 through this 'rot8000', though:
한글: 0xd55c 0xae00
똼軠: 0xb63c 0x8ee0
霜激: 0x971c 0x6fc0
矼傠: 0x77fc 0x50a0
壜ㆀ: 0x58dc 0x3180
㦼በ: 0x39bc 0x1260
ㆀ: 0x1a9c 0x3180
㦼በ: (repeating)
So... yeah. Weirdness all around. You might have better luck with a carefully crafted XOR pad for each codepoint, chosen so that the result is likely to be a printable character but can never land in the 0xD800..0xDFFF range (and similar ranges). Trying to "wrap" in Unicode would require first reinterpreting the codepoints as some continuous numeric representation.
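To make both ideas concrete, here's a rough Python sketch. It is not how 'rot8000' actually works; the function names (xor_low_bits, rotate, to_index, from_index) and the 0x07FF pad are made up for illustration. The XOR version relies on 0xD800..0xDFFF being an aligned block of 2048 codepoints: a pad confined to the low 11 bits never changes which block a codepoint is in, so nothing outside the surrogates can land inside them. The wrap version renumbers the codepoints so the surrogate gap disappears, rotates modulo the number of remaining values, and maps back.

```python
SURROGATE_START = 0xD800
SURROGATE_END = 0xDFFF
SURROGATE_COUNT = SURROGATE_END - SURROGATE_START + 1  # 2048 codepoints
SCALAR_COUNT = 0x110000 - SURROGATE_COUNT              # codepoints that are not surrogates


def xor_low_bits(text, pad=0x07FF):
    """XOR each codepoint with a pad limited to the low 11 bits.

    The high bits are untouched, so a codepoint stays inside its own
    2048-aligned block and can never move into 0xD800..0xDFFF.
    The pad value is an arbitrary choice for this sketch.
    """
    if pad & ~0x07FF:
        raise ValueError("pad must fit in the low 11 bits")
    return "".join(chr(ord(ch) ^ pad) for ch in text)


def to_index(cp):
    """Map a non-surrogate codepoint to a gap-free index in 0..SCALAR_COUNT-1."""
    if SURROGATE_START <= cp <= SURROGATE_END:
        raise ValueError("surrogates have no index")
    return cp if cp < SURROGATE_START else cp - SURROGATE_COUNT


def from_index(i):
    """Inverse of to_index: skip back over the surrogate gap."""
    return i if i < SURROGATE_START else i + SURROGATE_COUNT


def rotate(text, offset):
    """Rotate every codepoint by offset, wrapping over the gap-free numbering."""
    return "".join(
        chr(from_index((to_index(ord(ch)) + offset) % SCALAR_COUNT))
        for ch in text
    )


if __name__ == "__main__":
    half = SCALAR_COUNT // 2  # rotating by half the range is its own inverse
    scrambled = rotate("한글", half)
    print([hex(ord(c)) for c in scrambled])
    print(rotate(scrambled, half))
```

Neither version does anything about printability: plenty of codepoints are unassigned or private-use, so the output will often render as boxes. Handling the "similar ranges" would mean doing the same renumbering over a table of codepoints you actually want to allow, rather than over everything outside the surrogates.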