Hacker News new | ask | show | jobs
by bkuehl 1129 days ago
A lot of comments for Postgres, but it's the only major DB in 2023 that does not let you choose your character collation when creating a database. That is pretty much a deal breaker day 1. Guess you'll be doing a tolower() on every db search and not use indices which will kill performance or using column collation casts on every search query. I just don't get it.

I once tried to migrate a SQL Server DB to Postgres and eventually gave up, with MySQL being a pretty easy switch with some minor stored procedure rewrites.

Also it tends to do things way differently than every other DB. VACUUM is just a completely different concept that can footgun you pretty fast.

Postgres is pretty powerful but it has certainly made some interesting design choices.

2 comments

> doing a tolower() on every db search and not use indices

If you create the index with lower() it will uses that; e.g. "create index on tbl (lower(email))" and then "select * from tbl where lower(email)=lower($1)". That's more or less the standard way to do this but there are some other ways as well. It's more explicit than MySQL, so in that way it's better. It's more effort and easy to forget, and in that way it's worse – either way: it's definitely possible to do case-insensitive searches with indexes.

When I first used PostgreSQL I ran in to "how the hell do I do this?! MySQL just lets me [...]" kind of issues, but after many years of PostgreSQL usage I now have the opposite when I use MariaDB, which also has its share of awkwardness and issues (just different ones).

> It's more explicit than MySQL, so in that way it's better.

It sounds like you're under the impression that MySQL just makes everything case-insensitive and is silent about this? That's decidedly not the case.

MySQL 8 ships with 41 different character sets, supporting a total of 286 different collations. Collation names explicitly include "ci" (case-insensitive) vs "cs" (case-sensitive), as well as "ai" (accent-insensitive) vs "as" (accent-sensitive), and also the language/region conventions used for sorting purposes.

You can choose collation at the column-level granularity, as well as setting defaults at the table, schema, and server levels. It's completely explicit and very configurable.

There is no way to see from the query itself if it's case-sensitive or insensitive; that is what I meant.
Eh, just from a SQL query alone, there's no way to see that (lower(email)) is indexed, or see column data types etc. That metadata lives in the table definition, which is a normal place for it, so it seems reasonable for the collation / case-insensitivity to not be explicit in the query text.

Besides, MySQL also supports functional indexes, so you could do the (lower(email)) approach in MySQL too if you really want!

No, you can't see everything, but you can see the exact comparison it's making. Is that useful? You can decide that for yourself but I like being able to see as much of the logic (and thus verify the correctness) in the query itself. Also helps with copy/paste and some other things.

I never said you can't use functional indexes in MySQL. Someone said "you can't do this in PostgreSQL" and I just commented "here's (one way) to do this, maybe that's helpful some day". That's it.

My apologies, I misunderstood "It's more explicit than MySQL" to imply that you were saying that approach couldn't be used in MySQL.
I was disappointed with MariaDB and tried it before using MySQL. It is so far behind MySQL that it can't be considered equivalent anymore. And I really wanted to use MariaDB instead.

The "MySQL just lets me" stuff eventually adds up and hinders development. For example, your lower() on the param example is now incompatible with most ORMs unless you do that in code or create a special SQL statement for that. This isn't all fringe cases that you run into when you're months in either. I really wonder on some of these comments saying they've vetter both and Postgres wins hands down.

Postgres is solid but it definitely has its warts and downsides too.

Am I missing something here? Postgres does allow you to choose character collation when creating a database, as well as when creating new columns.
I'm sorry, I was wrong in that Postgres does let you specify a few character collations at the DB level, but they are pretty much ASCII vs UNICODE with no case-insensitive configurations. You can create collations in v12 and assign them to particular columns.