| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by daxelrod 3725 days ago
	It's long bothered me that SQL doesn't have a way of saying "join these two tables based on the key relationships I've already defined in my schema". (Or maybe some variants do?) The database already knows that guest.reservation_id is a foreign key referencing reservation.id. Why should you have to repeat yourself?

4 comments

Someone 3725 days ago

I have longed for that, too, but I'm not sure I should want it.

Problem is that it becomes problematic as soon as you keep such queries around (in stored procedures, source code, or batch files), and then modify your schema by adding or removing a foreign key.

link

ComodoHacker 3725 days ago

What if you have more then one foreign key referencing the same table? Or worse, what if you had one and later add another? You would get undefined behavior.

Added:

Also SQL language itself has nothing to do with keys, relationships and other details of database design. It's different level of abstraction.

Various query builders/SQL generators do what you want.

link

greggyb 3725 days ago

And when there are multiple foreign keys between two tables?

link

kbsletten 3725 days ago

Genuinely curious, what's the use-case for this?

link

level3 3725 days ago

Some examples:

In project management, tasks might have an assigner_id and assignee_id

In transaction management, transactions might have a sender_id and receiver_id

In sports, matches might have a team1_id and team2_id

link

jonahx 3725 days ago

> In transaction management, transactions might have a sender_id and receiver_id

in normal form, shouldn't you have "transactions", "transaction_senders", and "transaction_receivers" anyway? similarly for the other cases?

i know sometimes the simplicity of a single table is a valid choice, so your point still stands, but it does feel like the examples are cases of "well, if you're were doing the right way(TM), that wouldn't happen"

link

dsp1234 3724 days ago

shouldn't you have "transactions", "transaction_senders", and "transaction_receivers"

were doing the right way(TM), that wouldn't happen

Having a separate transaction_senders and transaction_receivers wouldn't be the right way(TM) unless it's possible to have multiple senders/receivers. If a transaction can only have 1 sender/receiver then normalization is complete when the sender/receiver data is removed from 'transactions' table and replaced with the sender_id and receiver_id columns.

When introducing transaction_senders/transaction_receivers tables without further constraints, it would be immediately possible to have multiple transaction senders and multiple transaction receivers for a single transaction, which is likely an error.

link

level3 3724 days ago

These examples depend on the use case. If you are guaranteed to only have one sender and one receiver, then having them in the same table is already normal form.

Perhaps (?) a less controversial example is a table of flights between airports. Should a single flight have departure_ids and destination_ids? (assuming we treat legs as separate flights)

Or what about package delivery? Do we need sender_ids and receiver_ids there?

link

cname 3725 days ago

created_by and updated_by both pointing at the user table?

link

slagfart 3725 days ago

Explicit foreign keys perform better in some joins, as the optimiser can assume that all possible values are in the source column of that foreign key.

link

vog 3725 days ago

> join these two tables based on the key relationships I've already defined in my schema

This is almost exactly what "NATURAL JOIN" does. It is like "JOIN USING" with a pre-filled column list.

However, this isn't decided based on the schema, but based on equal column names. So you need to apply the dicipline as noted by the parent comment.

See also the sibling discussion thread at: https://news.ycombinator.com/item?id=11450187

link