Hacker News new | ask | show | jobs
by mtone 2980 days ago
I'd bring it a little further and would write:

SELECT * FROM Person as P INNER JOIN TeamMember as TM on TM.PersonId = P.Id

I have:

- Aliased each table and prefixed every field names with their table alias in my join conditions.

- Explicited the JOIN type.

The above:

- Reduces mistakes due to ambiguities that tend to generate unwanted duplicates rows in SQL.

- Increases the likelihood of getting an error at parse time, instead of run-time or analysis-time, thanks to added scoping.

- Works in any schema, no matter what naming conventions are followed.

- Keeps working as the query becomes more complex with multiples table aliases or self-joins, and similar field names appearing in the set.

- Better expresses intent. Sure JOIN defaults to INNER JOIN, but writing "INNER JOIN" shows that you genuinely expect any row not matching your condition to be removed from the result set.

3 comments

I'm not a fan of short aliases. They obscure what you are attempting to do.

Personally I write very little SQL anymore (the ORM does that for me unless I need performance), but the only time I use aliasing is when the same table is joined multiple times.

Like:

SELECT Mother.Name, Father.Name, Child.* FROM Person Child JOIN Person Mother ON Mother.Id = Child.MotherId JOIN Person Father ON Father.Id = Child.FatherId

That said, this naming convention also obscures what Person.FatherId points to without looking at the FKeys. So Take that with a grain of salt.

As a newbie, this makes more sense to me:

  SELECT * FROM person, team WHERE person.team_id=team.id
(no unnecessary aliasing, no weird JOIN phrasing)
I can understand not using aliasing in a simple query, but I'd recommend against using these non-ANSI joins (deprecated syntax). What you're writing can be interpreted as either:

SELECT * from person CROSS JOIN Team WHERE person.team_id=team.id

SELECT * from person INNER JOIN Team ON person.team_id=team.id

That they happen produce the same result in a query is practically just luck. Changing the join from INNER to LEFT OUTER is also much easier than managing (+)'s in the WHERE clause, once you're used to it.

This is good advice, IMO. You will have to use left/right joins at some point, and having the join type at the join location is useful, but having the join conditions at the join location is immensely useful.

It's harder to see in your example, likely because you tried to keep it similar to the example presented, but once there's more than a couple joins, having the join conditions close to the join is essential for keeping track of what's going on. e.g.

  SELECT * FROM person, team, role, group, person AS lead
  WHERE person.team_id=team.id
    AND person.role_id=role.id
    AND person.group_id=group.id
    AND group.lead_id=lead.id
Compared to:

  SELECT * from person
    INNER JOIN team ON person.team.id=team.id
    INNER JOIN role ON person.role_id=role.id
    INNER JOIN group ON person.group_id=group.id
    INNER JOIN person AS lead ON group.lead_id=lead.id
And let's be clear, we know these should be left joins, because the chance some person doesn't have a team, role, group, or a group ends up without a lead is high when sampled over time. And simulating left joins with the non-ANSI joins quickly gets unwieldy.
Aliasing a query that short just obfuscates intention. It might be necessary for some larger queries, or joining the same table multiple times, but there is a readability cost you are paying for it.

For example:

JOIN team_member ON team_member.person_id = person.id

It's completely self evident what is being joined without the need to trace back to the table aliases.

Pretty much all of your advice is premature optimization in my eyes. You can/should do those things when they are needed, but there is no reason to automatically write every single SQL query that explicitly.