|
|
|
|
|
by trezor
5706 days ago
|
|
Clearly you have not ventured too far outside the "safe" en-US cultures ;) In the Scandinavian languages (Danish, Swedish and Norwegian) "aa" is (for whatever historical reasons) considered a synonym for "å", which is that last letter of the alphabet. In the same way "ae" maps "æ" (third last) and "oe" maps to "ø" (second last). Hence, using culture-aware sorting, the following array may actually not be sorted: [ "a", "aa", "ae", "b", "oe", "of" ]. You will find similar sort behaviour in a lot of databases when you set the database/table/column collation to other cultures. While I agree the PHP implementation is silly, your blanket dismissal of the possibility that 'aa' can't be greater than 'z' is equally ignorant. |
|
It was surprising at the time that these sort algorithms were not generally available in code. We also found that we couldn't rely on the db (MySql) to do the right thing, which was not helped, of course, by having these character sets, and others, in the db. The db was fully utf8.
Once you grok PHP's underlying method, it's simply a case of using the above mentioned technique -- replacing the "foreign" symbols with character pairs -- to establish the correct order.
In the Danish case above, for example, I used zx, zy, and zz.