by matja
309 days ago
> You can't even clearly define what an "atomic sequence of glyphs" is.

Kinda. Grapheme cluster breaks are defined in Unicode, but they carry all the baggage and edge cases you'd expect from human languages evolving over time, so they can be encoded in as few as a thousand rules: https://github.com/unicode-org/icu/tree/main/icu4c/source/da...
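
To make the edge-case point concrete, here's a rough sketch (not the Unicode algorithm, and not what ICU does): a naive splitter that only glues combining marks onto the preceding code point. `naive_clusters` is a made-up helper for illustration. It handles an accented letter fine, but falls over on a ZWJ emoji sequence, which is exactly the kind of case the real rules have to cover.

```python
import unicodedata

def naive_clusters(text):
    """Hypothetical helper: attach combining marks (categories Mn/Mc/Me)
    to the previous code point. This is only a tiny subset of what the
    real grapheme cluster rules handle."""
    clusters = []
    for ch in text:
        if clusters and unicodedata.category(ch).startswith("M"):
            clusters[-1] += ch  # extend the current cluster
        else:
            clusters.append(ch)  # start a new cluster
    return clusters

# "e" + COMBINING ACUTE ACCENT, then "a": three code points, two clusters.
accented = "e\u0301a"
print(len(accented), naive_clusters(accented))

# MAN + ZERO WIDTH JOINER + WOMAN: a real segmenter keeps this together,
# but the naive version splits it, since ZWJ is category Cf, not a mark.
zwj = "\U0001F468\u200D\U0001F469"
print(len(naive_clusters(zwj)))
```

For anything real you'd want a library that implements the full rules rather than something like this.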