by cryptonector
2733 days ago
If you're just comparing strings, then do a character-at-a-time comparison: decompose (no need to recompose) only one character at a time (look ma, no allocation needed), compare the two decomposed characters' code points, then fail or move on to the next character. I call this form-insensitive string comparison.
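A minimal Python sketch of the comparison described above, for illustration only. The function names are hypothetical, and Python allocates small strings internally, so this shows the logic rather than the allocation-free property. It also assumes combining marks appear in the same order in both inputs; it does not canonically reorder marks across character boundaries, which a complete implementation would need to do.

```python
import unicodedata
from itertools import zip_longest


def decomposed_code_points(s, form="NFD"):
    """Yield the code points of s, decomposing one character at a time."""
    for ch in s:
        # Normalizing a single character gives that character's full decomposition.
        for cp in unicodedata.normalize(form, ch):
            yield ord(cp)


def form_insensitive_equal(a, b, form="NFD"):
    """Compare two strings by their per-character decompositions.

    Assumption: combining marks are already in the same order in both
    inputs; no canonical reordering is done across character boundaries.
    """
    streams = zip_longest(decomposed_code_points(a, form),
                          decomposed_code_points(b, form))
    # zip_longest pads the shorter stream with None, which never equals a
    # code point, so strings of different decomposed length compare unequal.
    return all(x == y for x, y in streams)


# Precomposed "é" and "e" followed by a combining acute accent compare equal.
print(form_insensitive_equal("\u00e9", "e\u0301"))
```

The comparison stops at the first mismatching code point, which is the "fail or move on" step from the comment.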
Also, if you think you can decompose without allocating memory... well, try a code point like U+FDFA.
For reference, its compatibility (NFKD) decomposition is:
U+0635 U+0644 U+0649 U+0020 U+0627 U+0644 U+0644 U+0647 U+0020 U+0639 U+0644 U+064A U+0647 U+0020 U+0648 U+0633 U+0644 U+0645
(and that doesn't begin to touch any of the potential issues with variant forms, homoglyph attacks, etc.)
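The expansion is easy to check with Python's standard `unicodedata` module. This is a small sketch, and `decomposition_report` is a hypothetical helper name. Note that the long expansion comes from the compatibility form (NFKD); under canonical decomposition (NFD) this code point is left unchanged.

```python
import unicodedata

LIGATURE = "\uFDFA"  # ARABIC LIGATURE SALLALLAHOU ALAYHE WASALLAM


def decomposition_report(ch):
    """Return the decomposed code points of ch under NFD and NFKD."""
    return {
        form: [f"U+{ord(c):04X}" for c in unicodedata.normalize(form, ch)]
        for form in ("NFD", "NFKD")
    }


report = decomposition_report(LIGATURE)
for form, code_points in report.items():
    print(form, len(code_points), " ".join(code_points))
```

A fixed-size buffer for one decomposed character therefore has to hold at least 18 code points if compatibility forms are in play.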