Hacker News new | ask | show | jobs
by JoelJacobson 1906 days ago
Foreign Keys are great!

Too bad they were not invented until SQL-89, and could thus not be considered when the JOIN syntax we all know was designed and formalized by ANSI in SQL-86.

Imagine if we could simply give the name of a foreign key, as an argument to JOIN, letting the query planner look up the foreign key columns, to do the join on.

The default names for foreign keys are not very user-friendly, due to the SQL standard unfortunate decision to require the foreign key name to be unique per namespace, and not just per table, which would have made more sense.

In PostgreSQL, foreign keys just need to be unique per table, so there we could give them names such as "customer" instead of "rental_customer_id_fkey".

Imagine if we instead of:

  SELECT
    language.name,
    COUNT(*)
  FROM rental
  JOIN customer
    ON customer.customer_id = rental.customer_id
  JOIN address
    ON address.address_id = customer.address_id
  JOIN city
    ON city.city_id = address.city_id
  JOIN country
    ON country.country_id = city.country_id
  JOIN inventory
    ON inventory.inventory_id = rental.inventory_id
  JOIN film
    ON film.film_id = inventory.film_id
  JOIN language
    ON language.language_id = film.language_id
  WHERE country.country = 'Sweden'
  GROUP BY language.name
Would write something like:

  SELECT
    language.name,
    COUNT(*)
  FROM rental
  JOIN rental->rental_customer_id_fkey AS customer
  JOIN customer->customer_address_id_fkey AS address
  JOIN address->address_city_id_fkey AS city
  JOIN city->city_country_id_fkey AS country
  JOIN rental->rental_inventory_id_fkey AS inventory
  JOIN inventory->inventory_film_id_fkey AS film
  JOIN film->film_language_id_fkey AS language
  WHERE country.country = 'Sweden'
  GROUP BY language.name
After renaming the foreign keys:

  ALTER TABLE rental RENAME CONSTRAINT rental_customer_id_fkey TO customer;
  ALTER TABLE rental RENAME CONSTRAINT rental_inventory_id_fkey TO inventory;
  ALTER TABLE customer RENAME CONSTRAINT customer_address_id_fkey TO address;
  ALTER TABLE address RENAME CONSTRAINT address_city_id_fkey TO city;
  ALTER TABLE city RENAME CONSTRAINT city_country_id_fkey TO country;
  ALTER TABLE inventory RENAME CONSTRAINT inventory_film_id_fkey TO film;
  ALTER TABLE film RENAME CONSTRAINT film_language_id_fkey TO language;
We could write this as:

  SELECT
    language.name,
    COUNT(*)
  FROM rental
  JOIN rental->customer
  JOIN customer->address
  JOIN address->city
  JOIN city->country
  JOIN rental->inventory
  JOIN inventory->film
  JOIN film->language
  WHERE country.country = 'Sweden'
  GROUP BY language.name
And if allowing such a "foreign key operator" to be chained, we could write:

  SELECT
    language.name,
    COUNT(*)
  FROM rental
  JOIN rental->customer->address->city->country
  JOIN rental->inventory->film->language
  WHERE country.country = 'Sweden'
  GROUP BY language.name
This is similar to "4.9 Reference types" in the SQL standard ISO/IEC 9075-2:2016(E), but it wouldn't require a separate REF column, it would merely use the existing foreign keys which we already have in well designed proper database schemas. We would just need to give them better names.

Q: How would we name foreign keys if there are two going to the same table?

A: Imagine having a "users" table with two columns "child_user_id" and "parent_user_id", both referencing "users". The foreign keys on such columns could simply be named "child" and "parent".

2 comments

> Too bad they were not invented until SQL-89, and could thus not be considered when the JOIN syntax we all know was designed and formalized by ANSI in SQL-86.

TIL!

> After renaming the foreign keys

The idea about using FK as a JOIN target is interesting. While developing a syntax for PostgREST resource embedding[1], I also reached the conclusion that FKs would be a convenient way to join tables(also suggested renaming them as you do here).

IIRC, self joins are still an issue with FK joining.

[1]: https://postgrest.org/en/v7.0.0/api.html#embedding-disambigu...

Hi Steve,

Many thanks for making PostgREST, I'm using it for all my projects!

I'm trying to understand the problem you see with self joins. Why would it be a problem? At the end of my comment, I gave an example with a users table, for such a table, an expression such as "FROM users JOIN users->parent" would join-in users two times, but two different rows of users, since "users->parent" would follow the foreign key "parent".

You may be interested in the USING clause SQL-92 added for foreign keys where the referencing columns have the same name as the referenced columns: JOIN b USING (a_id). The only major RDBMS that doesn't support it is SQL Server afaik.
Yes, I know about USING, and you probably know about the one big issue it has; if you add a column name with the same name to some table included, then you get an error due to the ambiguity.

Example:

  SELECT
    language.name,
    COUNT(*)
  FROM rental
  JOIN customer USING (customer_id)
  JOIN address USING (address_id)
  JOIN city USING (city_id)
  JOIN country USING (country_id)
  JOIN inventory USING (inventory_id)
  JOIN film USING (film_id)
  JOIN language USING (language_id)
  WHERE country.country = 'Sweden'
  GROUP BY language.name
Works fine, but if we do

  ALTER TABLE rental ADD address_id integer;
Then we get

  ERROR:  common column name "address_id" appears more than once in left table
The foreign key based join approach doesn't suffer from this problem, since there which columns to join on is explicit and stable.