| I'm actually curious what the distinction is in your view. I've never really considered a column to be a foreign key when the constraint isn't used. Having a column that we give business logic context to is useful, and indexing a column that should contain values for another table is helpful for query speed, but at least in my opinion they really aren't foreign keys unless that constraint lives directly in the database layer itself. I'd say the same for columns that are used as unique identifiers without actually adding unique constraints to the column. There are good performance reasons to do either one if you're willing to take on the data integrity responsibility in the application code, but the column itself really is just a typed column if the constraints live elsewhere (again in my opinion, I think the technical definitions may ignore this functional argument). Where I find foreign key constraints helpful for queries is when I need to be absolutely sure of the data integrity. Say I need to make a complex query that joins across three different tables based on foreign keys. If the table constraints exist I know that (a) any value in a foreign key column is valid and the referenced key exists and (b) if no rows are found I can trust that its just because none exist. Without foreign key constraints, I may not know why the query didn't find any results. It could be because there just aren't any matching rows, it could also be that one of the keys is no longer valid (or never was). If I don't care about that second error state my query may not change much, but if I need to know why the query failed to match and handle any invalid data accordingly I couldn't do it. When writing, I also much prefer having a single insert that I know will fail if the foreign key isn't valid. This could be done with a more complex query, or a transaction, but then I'd be taking on that responsibility when it could live directly in the db. Beyond the complexity there, I have to assume the database authors would be able to write a more efficient foreign key validation check them I could from my end. That said, what's been your experience handling foreign keys when the constraint is either unsupported or unused? Do you avoid it mainly for the bump in query performance, and if so how do you avoid that performance hit elsewhere in your code? |
In my experience, working on systems where foreign key constraints are liberally applied has been a net negative. Certain classes of DML statements (ON DELETE CASCADE I remember as being infamous) are certainly worse in performance than they otherwise might be. As an administrator I remember being repeatedly and painfully hamstrung by the inability to make arbitrary DB writes, which may momentarily violate strict data integrity, but are necessary for immediate practical reasons.
Obviously data integrity suffers without explicit constraints, but I'd rather work to backfill and/or clean up messy data than deal with a frustratingly rigid and poor performing system. I've worked on a number of large-scale MySQL database deployments at various tech companies, and I can't recall many, if any, that required or possessed pristine referential integrity. I can see why it's conceptually compelling, and I appreciate how automated tooling can generate very useful entity relationship diagrams if FK relationships are explicitly spelled out by constraints.
I think the "performance hit elsewhere" only happens given the assumption that strict referential integrity is a requirement, perhaps in a banking context or another where messy data simply cannot be tolerated.