by ewiethoff 5676 days ago
Would Perl's Unicode::Normalize module fit the bill? I believe Unicode::Collate uses it so that a precomposed, single-codepoint accented 'e', for example, sorts the same as an 'e' followed by a combining accent.
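
To make the idea concrete, here is a rough sketch in Python rather than Perl. It uses the standard library's unicodedata.normalize as a stand-in for what Unicode::Normalize does; the helper name same_text is made up for illustration and is not part of either library.

```python
import unicodedata

precomposed = "\u00e9"   # 'é' as a single codepoint (U+00E9)
decomposed = "e\u0301"   # 'e' followed by COMBINING ACUTE ACCENT (U+0301)

def same_text(a, b):
    """Hypothetical helper: compare two strings after NFC normalization."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

print(precomposed == decomposed)            # False: the raw codepoints differ
print(same_text(precomposed, decomposed))   # True: equal once normalized
print(len(unicodedata.normalize("NFD", precomposed)))  # 2: letter + accent
```

Normalizing both sides first is what lets a comparison or sort treat the two spellings as the same text.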

That's similar. What I'd really like to see is a string API that doesn't let you separate the letter and the combining accent at all, whether it has a precomposed normalization or not (unless you dig all the way down to asking which codepoints a grapheme is composed from).
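
A minimal sketch of what I mean, in Python. GraphemeString and its codepoints method are hypothetical names, not an existing library. The grouping rule is also a deliberate simplification: it only attaches combining marks to the preceding base character, which is much less than full Unicode grapheme cluster segmentation (UAX #29).

```python
import unicodedata

class GraphemeString:
    """Hypothetical string type whose elements are base letter + combining marks."""

    def __init__(self, text):
        self._clusters = []
        for ch in text:
            # A nonzero combining class means ch is a combining mark:
            # glue it onto the previous cluster instead of starting a new one.
            if self._clusters and unicodedata.combining(ch):
                self._clusters[-1] += ch
            else:
                self._clusters.append(ch)

    def __len__(self):
        return len(self._clusters)

    def __getitem__(self, index):
        # Indexing and slicing work on whole clusters, so an accent can
        # never be separated from its letter through the normal API.
        if isinstance(index, slice):
            return GraphemeString("".join(self._clusters[index]))
        return self._clusters[index]

    def __str__(self):
        return "".join(self._clusters)

    def codepoints(self, index):
        """The explicit 'dig all the way down' escape hatch."""
        return [f"U+{ord(c):04X}" for c in self._clusters[index]]


s = GraphemeString("cafe\u0301")   # decomposed: 5 codepoints
t = GraphemeString("caf\u00e9")    # precomposed: 4 codepoints
print(len(s), len(t))              # 4 4
print(s.codepoints(3))             # ['U+0065', 'U+0301']
print(t.codepoints(3))             # ['U+00E9']
```

Both spellings have length 4 and slice the same way; the difference only shows up when you explicitly ask for the codepoints.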