Hacker News new | ask | show | jobs
by malinens 1966 days ago
Too bad MySQL does not have partial indexes.

We have one huge table I want to add some indexes for specific cases (for max 1% of records) but server will not have enough memory for it if I add those indexes for all records :/

3 comments

As long as you've got primary keys on the huge table, there's a hacky solution -- create a second table with columns for just the first table's primary key and the columns you're indexing and your desired index, and ensure you always write/update/delete both tables simultaneously using transactions. Then when needed, use the index on the second table and join it to your first with the primary key.

Annoying, but it should work for most queries I'd expect without too much SQL.

I've definitely "rolled my own indexing" like this in the past, though it's more often been duplicating strings into a custom "collation" or other transformations.

Another solution is simply to split your table in two, with the same columns in both, and the index only on one of the tables. But of course that really depends on your business logic -- queries that need to retrieve data from both tables together can get pretty hairy/slow, and if you've got auto-incrementing PKEY's then avoiding collisions between the two tables can be tricky on its own. So this is definitely the less general solution.

Of coure it certainly would be nicer if MySQL supported partial indexes. It seems so useful, I'm surprised it didn't happen long ago.

The first approach is one of the steps towards normalising a database.
Actually it's the opposite of database normalization.

Normalizing removes data redundancy. This adds data redundancy.

When I design a database structure, it's common to start with the most normalized representation possible. And then to denormalize the minimum necessary for performance reasons -- duplicating rows and/or columns just like here so certain data can be retrieved more quickly, whenever indexes aren't powerful or featured enough.

I think what lucian1900 may be thinking is that instead of

    create table purchase_order (
        id int primary key,
        ordered_on timestamptz not null,
        customer_id int not null references customer,
        canceled_on timestamptz
    );
you could have

    create table purchase_order (
        id int primary key,
        ordered_on timestamptz not null,
        customer_id int not null references customer
    );

    create table order_cancelation (
        order_id int primary key references purchase_order,
        canceled_on timestamptz not null
    );
This is indeed a better normalised schema and it allows you to index order_cancelation.canceled_on without worrying about nulls.
Oh then, absolutely. I was assuming a constraint that columns couldn't be removed from the original table. But if you can, then your example is an even better solution.
Exactly, that’s what I thought was being described.
Could you create a temporary high memory slave MySQL server, sync the master, add the index, sync back to master, and decommission the temporary high memory? I don't know enough about master/slave operations to know if it would work.
'malinens doesn't have the storage space to store an index over a column if all values in the column are indexed. They want to index only some values, but not others. This feature does not exist in MySQL.
MySQL has pluggable storage engines. TokuDB does what you're after (adds indexes on the fly, as well as alter tables on the fly without overloading the server).
This page about TokuDB reads:

> TokuDB has been deprecated by its upstream maintainer. It is disabled from MariaDB 10.5 and has been been removed in MariaDB 10.6 - MDEV-19780. We recommend MyRocks as a long-term migration path.

https://mariadb.com/kb/en/tokudb/

Is MyRocks comparable?

Altering table online without using pt-online-schema-change doesn't help if they want an index that covers only some of the keys but not others.