Hacker News new | ask | show | jobs
by DenisM 2923 days ago
Two reasons to not have indexes in the query:

1. Query expresses the result it produces, not the method that was used to obtain it. Semantic vs implementation. It may be a pain to write, but it will be easier to read later.

2. DBA could add/drop indexes on the fly to tune performance of a live system without making any changes to the application code. And being 100% certain he is not changing the semantics of what's going on.

As others noted, if you must you can use query hints for force particular index to be used for a particular operation. MSSQL also allows to pin down a query plan you like for a given query so that it doesn't drift away later due to environment changes.

I agree it is sometimes a pain to force SQL to use the index you wanted it to use.

2 comments

I’ve never worked anywhere where I had to worry about DBAs running around dropping indexes. The main reasons not to build an index are usually storage and write overhead. Every index a table has means you have to do another write operation on every insert, which can really start to add up. They can also add significant overhead to any migration operation that happens to require an index rebuild.

In my experience, the most common reason for an optimizer choosing not to use an existing index, is out of date statistics. For those who aren’t aware, the database collects table statistics for things like cardinality, number of distinct values, etc... This is the information the optimizer uses when it’s building a plan. If they get out of date the optimizer will start to come up with nonsense plans. Even worse, if your stats get too out of date, you can become scared to update them, because a new set of stats can potentially change the plans built for every single query in ways that are hard to predict.

As others have stated, you can put index hints directly into your queries, but this should be avoided as they’re hard to maintain. Most ‘enterprise’ RBDMS also have some form of plan management, but this should be avoided even more, as managed plans permanently bypass the optimizer, which is even harder to maintain.

So, the way I think this should work is that there should be a way of addressing the table sources in a query from an application that has them parsed, and then externally (i.e., in the application) provide planning hints to the compiler.

Something like (in some terrible pseudocode):

    q = parse_query("...");
    q.hint(FIRST_TABLE, "a");
    q.hint(INDEX, "b", "b_idx1");
    c = q.compile();
    r = c.run(...);