by p_l
703 days ago
There's a semantic difference between an "accented letter" and a "different letter that happens to look like another language's accented letter". "Ą" in Polish is not "A" with some accent. The idea behind Unicode was to preserve human written text, including distinctions like "this is letter A1 with an accent, but this is letter A2, which looks like A1 with an accent but is semantically different". Of course, worries about the size of the code space then resulted in the stupidity of Han unification, so Unicode is a bit broken.
|
Especially because the code point is actually called "Combining Ogonek".
And for anyone writing in Cyrillic, it's actually more accurate to use the combining form, even when the result is its own letter, because the only precomposed form technically uses a Latin A.
But my main point is that I do not think there is supposed to be any semantic difference in Unicode based on whether you use precomposed or decomposed code points.
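
For anyone who wants to check this, here is a minimal sketch using Python's standard unicodedata module. The variable names are mine, purely for illustration. It shows that the precomposed and decomposed forms normalize to each other (canonical equivalence), and that a Cyrillic А followed by the combining ogonek has no precomposed form to collapse into.

```python
import unicodedata

precomposed = "\u0104"     # LATIN CAPITAL LETTER A WITH OGONEK
decomposed = "A\u0328"     # LATIN CAPITAL LETTER A + COMBINING OGONEK
cyrillic = "\u0410\u0328"  # CYRILLIC CAPITAL LETTER A + COMBINING OGONEK

# The combining mark's official character name.
ogonek_name = unicodedata.name("\u0328")

# As raw code point sequences, the two Latin forms differ...
raw_equal = precomposed == decomposed

# ...but each normalizes to the other, so they are canonically equivalent.
nfd_matches = unicodedata.normalize("NFD", precomposed) == decomposed
nfc_matches = unicodedata.normalize("NFC", decomposed) == precomposed

# NFC has nothing to compose the Cyrillic sequence into, so it stays
# as two code points.
cyrillic_nfc = unicodedata.normalize("NFC", cyrillic)

print(ogonek_name, raw_equal, nfd_matches, nfc_matches, len(cyrillic_nfc))
```

So any comparison that should treat the two Latin forms as the same text needs to normalize both sides first; a plain string comparison will not.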