by masklinn 5401 days ago
> Wait, really? I thought normalization form C was the form where composite characters were always used when possible.

NFC is the form where precomposed codepoints are used wherever possible, hence the "smallest" form (fewest codepoints) post-normalization.

> Why restrict by the number of codepoints (vs characters) if you're explicitly going to use the form which goes out of its way to use multi-codepoint characters?

They're not; the form that goes out of its way to use multi-codepoint characters would be NFD.
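
A quick way to see the difference, as a rough sketch using Python 3's standard `unicodedata` module (the helper name `codepoint_counts` and the sample string are made up for illustration, not taken from the thread):

```python
import unicodedata

def codepoint_counts(text):
    """Return (NFC length, NFD length) of text, counted in codepoints."""
    nfc = unicodedata.normalize("NFC", text)  # compose where possible
    nfd = unicodedata.normalize("NFD", text)  # fully decompose
    return len(nfc), len(nfd)

# "cafe" followed by U+0301 COMBINING ACUTE ACCENT (decomposed "é")
sample = "caf" + "e\u0301"
print(codepoint_counts(sample))  # (4, 5)
```

In NFC the `e` and the combining accent collapse into the single codepoint U+00E9, so a limit counted in codepoints is as generous as it can be; in NFD the same text takes one codepoint more.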

1 comment

Well, crap. This is why Unicode is hard, folks. Thanks for the correction, masklinn.