| HN Mirror



	by ewiethoff 5676 days ago
	Would Perl's Unicode::Normalize module fit your bill? I believe Unicode::Collate uses it so that a single codepoint accented 'e', for example, sorts the same as 'e' with a combining accent.
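	To make the two spellings of accented 'e' concrete, here is a minimal sketch in Python rather than Perl, using the standard library's `unicodedata.normalize` as a stand-in for what Unicode::Normalize does. The helper name `same_text` is made up for illustration, and this only shows equality after normalization, not full collation.

```python
import unicodedata

# Two encodings of the same visible character.
precomposed = "\u00e9"   # LATIN SMALL LETTER E WITH ACUTE, one code point
decomposed = "e\u0301"   # 'e' followed by COMBINING ACUTE ACCENT, two code points


def same_text(a: str, b: str) -> bool:
    """Compare two strings after normalizing both to NFD (hypothetical helper)."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


if __name__ == "__main__":
    print(precomposed == decomposed)           # raw code points differ
    print(same_text(precomposed, decomposed))  # equal once normalized
    print(len(precomposed), len(decomposed))   # code point counts differ
```

	Normalizing both sides to the same form before comparing is what lets the two spellings be treated as one.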

1 comment

prodigal_erik 5675 days ago

That's similar. What I'd really like to see is a string API that doesn't let you separate the letter and the combining accent at all, whether it has a precomposed normalization or not (unless you dig all the way down to asking which codepoints a grapheme is composed from).
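A rough sketch of what such an API might look like, in Python. Everything here is hypothetical: `GraphemeString`, `split_clusters`, and `codepoints` are invented names, and the clustering rule (attach any character with a nonzero combining class to the preceding base) is a deliberate simplification, not the full grapheme cluster rules from UAX #29.

```python
import unicodedata


def split_clusters(text: str) -> list[str]:
    """Group each base character with the combining marks that follow it.

    Simplified assumption: a character belongs to the previous cluster
    when unicodedata.combining() reports a nonzero combining class.
    """
    clusters: list[str] = []
    for ch in text:
        if clusters and unicodedata.combining(ch) != 0:
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


class GraphemeString:
    """Hypothetical string type whose unit of indexing is a whole cluster."""

    def __init__(self, text: str) -> None:
        self._clusters = tuple(split_clusters(text))

    def __len__(self) -> int:
        return len(self._clusters)

    def __getitem__(self, index):
        picked = self._clusters[index]
        if isinstance(index, slice):
            return GraphemeString("".join(picked))
        return picked

    def __str__(self) -> str:
        return "".join(self._clusters)

    def codepoints(self, index: int) -> list[str]:
        """The explicit 'dig all the way down' escape hatch."""
        return [f"U+{ord(c):04X}" for c in self._clusters[index]]


if __name__ == "__main__":
    s = GraphemeString("cafe\u0301")  # decomposed final letter
    print(len(s))            # counts clusters, not code points
    print(s.codepoints(3))   # code points behind the last cluster
```

Indexing and slicing never hand back a bare accent or a stripped letter; the only way to see the individual code points is to ask for them by name.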
