Hacker News new | ask | show | jobs
by ltbarcly3 748 days ago
Not a footgun, it is well thought out and better than what you propose, and less surprising. (edited to be less insulting)

What makes you think that Postgres automatically making an arbitrary number of indexes on an arbitrary number of tables that you aren't trying to modify, that might be extremely bad for overall performance or take weeks to create, will save you from the rest of the things you haven't bothered to learn?

4 comments

I'll have to defend your parent commenter on this one.

Not having indexes for FKs is on average much worse for overall performance. Defaults should be reasonable.

In the great majority of cases you WANT to have indexes in FKs.

> expect the universe to magically fix all of your mistakes

This kind of derogatory hyperbole is not necessary nor productive.

I should expect tools to help me avoid mistakes. Not having an index on FKs is, more often than not, a mistake. It is reasonable to expect PostgreSQL to help me here.

I'll play devil's advocate. To be clear I generally agree that foreign keys should essentially always have a corresponding index, and that not including an index is a mistake far more often than it isn't.

My only counterargument is that—especially in production—adding indexes is expensive. Adding foreign keys is cheap. Latching a potentially expensive operation that can result in downtime to what should be (and often is expected to be) a cheap operation can cause an unexpected immediate loss of service. Though I believe (but am not certain) that in PostgreSQL's case, it should be relatively easy to recover from since DDL is transactional, and I can't see why you wouldn't be able to abort the index-creating transaction in progress. In MySQL I believe it's much more difficult to recover from this type of situation.

Not having an index on a foreign key can cause problems, but they tend to be of the long-term performance-reducing kind rather than the immediate outage kind. And in the event that adding a foreign key causes an issue, removing it is as simple as creating it.

That's all I've got: you're latching a slow, table-locking operating to what is expected to be an immediate one. Yes, I understand you're only suggesting this be the default, but I wouldn't expect most developers to predict the possible implications. Especially if the migration worked quickly in staging, where there's less data.

Edit: Actually, another one: if the index is created automatically, should it be removed when the foreign key is removed? This isn't a "problem" so much as a design issue with—I think—no necessarily clear, great answer. Just different choices with potentially-awkward tradeoffs.

Correct me if I'm wrong but, FKs are rarely created for existing columns.

You usually create the column and the FK in the same script. And usually starting with a NULL value for existing rows.

And if it's a new table then there's no rows anyway.

So the most common operations when creating FK's aren't expensive as far as I know.

You know what's expensive? Creating an index on a large table because you or your RDMS forgot to create the index when the FK was created and now JOINS are crawling to halt.

FK indexes by necessity need to be placed on the foreign table, which is just as likely to be a preexisting table that already contains data.

To be clear I 100% agree that adding indexes later is extremely painful. A little care when first creating tables goes a long way, and I’ve never seen a database fall over due to preemptive over-indexing but I’ve seen countless do so thanks to being underindexed.

Still, taking a DDL operation which is presumed to be essentially instantaneous and adding a default behavior that requires locking completely separate tables for a potentially-lengthy update does give me pause.

If I apply your reasoning, is there anything that is a footgun? I can just excuse anything as you are just expecting the universe to magically fix all of your mistakes & then the word loses all meaning.

Footguns are when your expectations are subverted in surprising ways. It means the system is set up to point a gun on a hair trigger at your feet and then just wait for someone to bump in and set it off - you could blame the person who bumped into the footgun or you could consider that maybe you shouldn't be pointing a gun at peoples feet on a hair trigger & blaming the person who was unlucky or clumsy enough to bump into it. Subverting reasonable expectations, having defaults tuned for the minority situation, and having inconsistent defaults are all footguns in my opinion. Footguns can be unavoidable in many cases when you don't have any reason to believe there is a majority or minority usage pattern, but that doesn't seem to be the case here based on what the author & people in the thread seem to be saying.

Arguing that someone needs to learn arbitrarily many things to properly use a tool is just gatekeeping; this isn't the only footgun in Postgres. If you notice, there's reflection going on here on whether there may be ways to improve the tool to begin with (e.g. maybe the default for FKs should be to index them given that that's what people usually do on FKs anyway & it's the default for PKs).

I think footguns are something thats 100% unexpected. I would argue that a user of a database that sees any kind of reasonable size ought to be very aware of how things work, because to be blunt (to use same language as GP), if you are caught off by this, chances are that you are not qualified to tackle the thing you're doing. It is scary how many in high positions have no clue about indexes, and either are unaware, or have gross misconceptions about how things work
You're right in that the developer ought to have control over the kind of index that gets created.

Having an ability to lint a DB and check for missing indexes would be useful, that could even be useful in a vanilla Postgres (however you want to provide it). Perhaps auto-creating them is a step too far.

Note that your statement: "you just don't know what you are doing" comes off as unnecessarily arrogant and off-putting.

Did someone pee in your coffee this morning?

I know very clearly how it works. It's essentially a tradeoff: don't create indexes by default, and you don't take the hit of creating indexes but then hit the risk of queries starting to fall over due to lack of indexes when you start to scale. Alternatively, create the indexes (again, by default), with the risk that you may be creating ones you don't need.

Importantly, all I am recommending is what the default behavior is - I still think you should be able to opt out when creating the FK. It's just that (a) in my experience you do end up wanting an index at least 80-90% of the time, and (b) not adding indexes usually has much worse implications than adding them unnecessarily.

Also, this is obviously an easy thing to get wrong given the number of times I've seen different developers hit this, and the fact that different DB engines have settled on different defaults (e.g. MySQL does automatically create FK indexes and requires an index on all FKs).

But please, continue, let me know what an idiot I am.

lol get a real dba, saying you know so many developers hitting a particular issue is just another way of saying I know so many people who don’t know what they’re doing.

RDBMS are complex and have to handle the most wide variety of applications of any modern software in existence.

Accommodating your particular niche use case of make Postgres be my own magical dba because I don’t know how to run a simple query to check if my FK columns have indexes is pretty silly.