Hacker News new | ask | show | jobs
by auxbuss 5707 days ago
A couple of years ago I had to devise PHP sort algorithms for Danish, German, and Russian. (Unsurprisingly, native speakers are essential for testing this kind of thing ;) )

It was surprising at the time that these sort algorithms were not generally available in code. We also found that we couldn't rely on the db (MySql) to do the right thing, which was not helped, of course, by having these character sets, and others, in the db. The db was fully utf8.

Once you grok PHP's underlying method, it's simply a case of using the above mentioned technique -- replacing the "foreign" symbols with character pairs -- to establish the correct order.

In the Danish case above, for example, I used zx, zy, and zz.