by IncreasePosts
793 days ago
Isn't that exactly what the Unicode collation algorithm is? https://unicode.org/reports/tr10

tl;dr from its Wikipedia page:

> The Unicode collation algorithm (UCA) is an algorithm defined in Unicode Technical Report #10, which is a customizable method to produce binary keys from strings representing text in any writing system and language that can be represented with Unicode. These keys can then be efficiently compared byte by byte in order to collate or sort them according to the rules of the language, with options for ignoring case, accents, etc.
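
To make the "binary keys compared byte by byte" idea concrete, here is a toy Python sketch. It is not the real UCA: the weights are made up (UTF-8 bytes of the casefolded base letter, combining marks standing in for accents, a one-byte case flag), there is no DUCET weight table and no language tailoring, and `toy_sort_key` and its `strength` parameter are names invented for this illustration. For real collation you would reach for an actual implementation such as ICU.

```python
import unicodedata


def toy_sort_key(text: str, strength: int = 3) -> bytes:
    """Build a toy multi-level sort key (NOT real UCA weights)."""
    primary, secondary, tertiary = [], [], []
    # NFD splits "é" into "e" + U+0301 so the accent can be weighted separately.
    for ch in unicodedata.normalize("NFD", text):
        if unicodedata.combining(ch):
            secondary.append(ch.encode("utf-8"))  # accent: secondary level only
            continue
        primary.append(ch.casefold().encode("utf-8"))  # base letter, case removed
        secondary.append(b"\x01")  # placeholder so accents stay aligned to letters
        tertiary.append(b"\x02" if ch.isupper() else b"\x01")  # lowercase first
    levels = [b"".join(primary), b"".join(secondary), b"".join(tertiary)]
    # A 0x00 separator sorts below every weight used here, so a string that is
    # a prefix of another sorts first. Lower strength drops the later levels.
    return b"\x00".join(levels[:strength])


words = ["r\u00e9sum\u00e9", "Resume", "resume", "rope"]

by_code_point = sorted(words)
# ['Resume', 'resume', 'rope', 'résumé']

by_toy_key = sorted(words, key=toy_sort_key)
# ['resume', 'Resume', 'résumé', 'rope']
```

The keys are plain `bytes`, so sorting needs nothing smarter than a byte-wise comparison. Passing `strength=1` gives the same key for "Résumé" and "resume" (case and accents ignored), and `strength=2` ignores case only, which is roughly the kind of option the quote mentions.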