|
|
|
|
|
by rraval
2720 days ago
|
|
> the solution for which is to use icu We ran headfirst into this issue at my company and we've actually been recommending the opposite (use the "C" locale on the database, treat collation as a render level concern). I have a whole write up explaining the technical motivations behind that recommendation: https://gist.github.com/rraval/ef4e4bdc63e68fe3e83c9f98f56af... |
|
That is, to do anything you want to do a locale sensitive way, such as querying the database for a given case insensitive string, or pagination, you'll need to have the DB index be locale/collation aware or else return every possible value and the renderer sort it out.
As an example, how you would you repeatedly return a range of results based on a string in a locale-aware way, e.g., to display paged results, if you defer the work to the renderer?
The only general solution I'm aware of which lets you "bypass" the DB collation is to use a locale-aware collation library like ICU to generate a binary sort key, which can be compared using plain binary comparison and storing those in the DB. This still means the overall index is locale aware, but the DB doesn't need to be aware of any collation rules: only the code that generates queries and handles the results needs to do the sort key transformation.
It means that you have a single library that does all the conversion, which you can probably control more easily, rather than delegating this to the database, where you might need to support several vendors or at least various versions (and problems can arise even within a single DB version as this postgres issue shows).