by avinassh 1402 days ago
How is string reversal done with Unicode?

Combining characters are the most obvious problem.

The two examples below both display as "naïve", but the first is written with "ï" as a single code point, while the second is an "i" followed by a combining dieresis. In the first example, the dieresis correctly stays attached to the i; in the second, the dieresis incorrectly moves to the v. To do it right, you have to scan through the string and keep each base character together with all of its combining characters, in their original order.

"na\u00efve".split("").reverse().join("") // CORRECT: "ev\u00efan", displayed as "evïan"

"nai\u0308ve".split("").reverse().join("") // INCORRECT: "ev\u0308ian", displayed as "ev̈ian". It should have been "evi\u0308an".
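
For what it's worth, here is a rough sketch of the "keep the base character and its combining characters together" approach. The function name `reverseKeepingMarks` is made up for illustration, and it assumes a JavaScript engine that supports Unicode property escapes in regular expressions (ES2018 or later). It only handles combining marks, not every kind of multi-code-point sequence.

```javascript
// Hypothetical helper: reverses a string while keeping each base character
// together with the combining marks that follow it.
// Assumes ES2018+ (Unicode property escapes with the "u" flag).
function reverseKeepingMarks(str) {
  // \P{M}\p{M}* = one non-mark code point plus any combining marks after it.
  // \p{M}+      = fallback for marks that have no base character before them.
  const clusters = str.match(/\P{M}\p{M}*|\p{M}+/gu) || [];
  return clusters.reverse().join("");
}

console.log(reverseKeepingMarks("nai\u0308ve") === "evi\u0308an"); // true
console.log(reverseKeepingMarks("na\u00efve") === "ev\u00efan");   // true
```

Because the `u` flag makes the regex work on code points, surrogate pairs also stay intact. Sequences such as emoji joined with a zero-width joiner are not covered by this sketch; segmenting by grapheme cluster (for example with `Intl.Segmenter`, where available) would probably be the more complete route.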