Hacker News new | ask | show | jobs
by aidos 3340 days ago
Curious, this doesn't fit my mental model: it's nonsensical to have a foreign key that isn't indexed

I guess you normally want indexes on parent - child tree relationships (book -> chapter), but you don't want them all the time. What about when you have an 'article' with a 'status and a 'category' and you only ever find all articles by combination of category and status? In that case you'd be maintaining 3 indexes, category, status and status_category, but the only one you'd need would be status_category.

Indexes are a tool to to allow for optimising lookups while foreign keys are a tool to allow you to keep your data consistent.

2 comments

I don't see how your example doesn't require an index, specifically because, as you said, indexes are a tool to allow for optimizing lookups.

I imagine that maybe you're suggesting you first query the DB for the ID of the combined status_category, and then query the article by that ID. That's not a good idea, for a couple of reasons. First, you're making two round trips to the DB when you could, with no additional effort (just effort in a different place) be doing one. Second, you've introduced a data race condition. If someone deletes that status_category after you've queried for it but before you've queried the articles, you aren't going to get the results you want.

It would be better to do a join across articles to status_category to status and category, then query based on the status and category values you want. Without an index on the FKs between status_category and status and category, a relatively small table can have a big impact on query performance.

Finally, while I know your example is arbitrary, it's a little hard to argue against a design that is probably wrong. I doubt the suggested schema for articles and categories is a good one. If I argue "you should never have to arbitrarily subtract '1' from a result just to get the results you want", it would not be a good counterargument to say, "yes, but sometimes you want to add 2 and 2 and get 5, so then you need to subtract 1". The problem isn't where you see it.

FKs aren't just a consistency tool. Consistency and referential integrity are features that results from having an FK, but the FK is a signal that data can be searched in a certain way.

> I don't see how your example doesn't require an index, specifically because, as you said, indexes are a tool to allow for optimizing lookups.

If you never lookup by a column alone, you don't need a single-column index on that column. An FK need not ever be a lookup target (the target column[s] it references are necessarily a lookup target, but not necessarily vice versa.)

An FK is probably usually going to want some kind of index, but it's not nonsensical to have a non-indexed FK.

Touché! An index signals the data should be searched in a certain way!

I don't think I explained my arbitrary scenario particularly well :-) I'm definitely not suggesting 2 queries.

If I have articles that could be category: math|science and source: website_a|website_b|... and I only ever query for source and category together then the other indexes aren't used.

It's a contrived example, but in my mind the existence of a foreign key doesn't imply an index is required.

I assumed they meant an index on multiple columns (status, category), not a separate table for status_category with an index on that.
I think in general most rdbms require indexes to properly enforce foreign key constraints in a reasonable way.
If the parent entities are immutable, or are at least immortal and have primary key values that are immutable, then the DB engine will only need the index on the target primary/unique key/index to maintain the constraint.

Though in general this is not the case, as you say, it is the case often enough that enforcing "FK means an index" could be an annoyance.