Hacker News
by lelf 4617 days ago
It's broken.

Λ̊1 → ⊻∪ά → Λ̊⋌

𝄞 → 뤔뷾 → 駴點

Edit: anyway, even with a correct (a+b)%n, it's a plain bad idea.

Unicode is not the English alphabet. Everything outside the Basic Multilingual Plane is broken automatically. And even within the BMP there's going to be a bag of glitches, starting with hanging combining characters and ending with ‘oops, someone normalised our string and it's now different’ (different for the site, not for the user / Unicode).
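To make those three failure modes concrete, here is a rough Python sketch. The function `rot_bmp` and the shift of 0x8000 are my own assumptions for illustration, not the site's actual algorithm. It only shows what a "correct" (a+b)%n over BMP code points still gets wrong.

```python
import unicodedata

BMP_SIZE = 0x10000  # number of code points in the Basic Multilingual Plane


def rot_bmp(text: str, shift: int = 0x8000) -> str:
    """Hypothetical naive rotation: (code point + shift) % BMP_SIZE.

    Characters outside the BMP are passed through untouched, because the
    modulus has no sensible meaning for them.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < BMP_SIZE:
            out.append(chr((cp + shift) % BMP_SIZE))
        else:
            out.append(ch)
    return "".join(out)


# 1. Outside the BMP: U+1D11E (musical G clef) is one code point but two
#    UTF-16 code units, so rotating code units tears it in half.
g_clef = "\U0001D11E"
utf16_units = len(g_clef.encode("utf-16-le")) // 2

# 2. Hanging combining character: U+8301 rotates to U+0301 (combining
#    acute accent), which now has no base character to attach to.
orphan = rot_bmp("\u8301")

# 3. Normalisation: these two strings are canonically equivalent, but the
#    rotation treats them as different inputs.
composed = "\u00e9"     # e with acute, one code point
decomposed = "e\u0301"  # e followed by combining acute, two code points
same_text = unicodedata.normalize("NFC", decomposed) == composed
same_rotation = rot_bmp(composed) == rot_bmp(decomposed)
```

Here `same_text` is true while `same_rotation` is false, so anything that normalises the string between encoding and decoding changes what comes back out.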

2 comments

Pretty sure it's meant as a joke
It is meant as a joke -- but I'm also planning to fix these issues ... apparently this was the right place to bring it to find all the situations where it doesn't work correctly :)
Rather than rotating through the entire BMP, I would suggest instead using Unicode's localized collations, and just rotating every character that's part of a fully-orderable "alphabet" set through that set according to those orders. (This means, for example, rotating Japanese hiragana, but not kanji.)
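A minimal sketch of that idea in Python, under loud assumptions: the names `ALPHABETS` and `rotate_alphabets` are made up here, and plain code point order within a range stands in for a real locale collation (a proper version would take its ordering from CLDR/ICU collation data instead). Each alphabet rotates by half its own length, and anything not in a listed alphabet, kanji included, passes through unchanged.

```python
def _span(first: str, last: str) -> list[str]:
    """All characters from first to last inclusive, in code point order."""
    return [chr(cp) for cp in range(ord(first), ord(last) + 1)]


# Assumed "fully orderable" sets; code point order is a stand-in for collation.
ALPHABETS = [
    _span("a", "z"),
    _span("A", "Z"),
    _span("\u3041", "\u3096"),  # hiragana, small a through small ke (86 chars)
]

# Map each character to its alphabet and its position within that alphabet.
INDEX = {
    ch: (alphabet, i)
    for alphabet in ALPHABETS
    for i, ch in enumerate(alphabet)
}


def rotate_alphabets(text: str) -> str:
    """Rotate each character by half the length of its own alphabet."""
    out = []
    for ch in text:
        entry = INDEX.get(ch)
        if entry is None:
            out.append(ch)  # kanji, digits, punctuation, emoji: untouched
        else:
            alphabet, i = entry
            n = len(alphabet)
            out.append(alphabet[(i + n // 2) % n])
    return "".join(out)
```

Because every set listed here has an even length, applying `rotate_alphabets` twice returns the original text, the same way ROT13 does for Latin letters. This still does not address combining marks or normalisation; it only limits the rotation to characters that have a meaningful order.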
CJK support is fixed now.