Hacker News new | ask | show | jobs
by irrational 2406 days ago
In regards to formatting sql, I used to do it the way shown, but a coworker formatted the columns in the select with the commas in front. This seemed strange to me until I tried it. I realized that this solved the problem of sometimes a query would be changed and the last item in the select list would be removed, but the last comma would not be removed. Or, a new item was added to the end of the select list, but they neglected to add in a comma at the end of the previous last item.

SELECT

  col1

  ,col2

  ,COUNT(col3)
FROM

  t1

  JOIN t2 ON ta.pk = t2.fk
WHERE

  col1 = col2

  AND col3 > col4
GROUP BY

  col1

  ,col2
HAVING

  COUNT(col3) > 1
3 comments

I hate hate hate the way it looks, but unfortunately it is objectively superior. This is why I prefer more languages go out of their way to make their syntax tolerant of optional trailing commas.
I don't bother with this but instead try to keep column names alphabetical. I try to do the same thing in other languages with property names, etc. This gives greater stability in version control, as it removes the temptation to change the order for subjective reasons. Even better if it can be automated with linting.

This "fix" only helps with the case where new items are inserted at the end. By using alphabetic ordering you increase the chances that a new item will be inserted at the beginning or the middle, in which case it makes no difference where you put the commas.

It only really helps at all because inserting new items at the end is common, whereas removing items from the start is rare, at least in SQL (all it really does is displace the dangling comma problem from the former case to the latter). However, always inserting new items at the end tends to lead to unintuitive ordering, which in turn leads to additional VCS churn when that becomes seen as technical debt.

Commas at the start does give objectively better readability in one sense: it ensures they are aligned vertically. That makes it easier to spot errors at a glance. You might call this "ease of formal reviewability".

However, in practice it seems to be worse for general readability, for the entirely subjective reason you already pointed to. Since code is typically read more often than it is written, it's important to optimise for that first.

Since an error here will always cause a hard fail, it falls into a different category than say, omitting braces in a C-style if statement, which can introduce subtle bugs through bad merges. In the latter case, ease of formal reviewability has to take precedence over subjective aesthetics.

This also avoids the majority of merge conflicts. Which is an even better reason to do it.
Well, this simply moves the problem of "commenting out" one column from the end of the list to the start of the list. I find myself doing both equally frequent.
where 1=1 and

  ,first=condition
Similarly, by having the AND start in each subclause instead of being at the end of the previous clause, it's easier to "--" comment out specific clauses during development. Again, harder to read, but easier to work with.
for whatever reason I actually find the leading "and/or" to be more readable than the trailing and/or, as long as there are more than two. The leading commas look too noisy to me, though
Personally I find that one easier to read and tend to write e.g.:

  WHERE foo
  AND   bar
Or with long, complex ones:

  WHERE
      foo
  AND
      bar
Interesting. Can you please provide an example of what you mean?
WHERE

T.first_name = "Ender" AND

T.last_name = "Wiggins"

versus

WHERE

T.first_name = "Ender"

AND T.last_name = "Wiggins"

you could also do

WHERE 1=1

AND T.first_name = "Ender"

AND T.last_name = "Wiggins"

Thanks a lot, I thought it might have been WHERE 1=1

I will now use that to easily remove filters.

An extension for the SELECT based on the very insightful top comment could be

SELECT NULL

,something1

,something2

My OCD simply refuses to allow me to write it that way. I also tend to write SQL in lowercase which is apparently semi-controversial.
My (diagnosed) OCD compels me to use leading commas. I also use "WHERE 1=1" so each where constraint begins with AND/OR. I tend to write a lot of "investigatory" SQL that requires frequent commenting. I rarely need to comment out the first column or constraint, so commenting in general is much cleaner & faster.

I did not learn SQL this way. I was shown these tricks and their adoption was near immediate BECAUSE it just made more sense to my brain.

This isn't to devalue/argue against your comment. I just find it interesting that the common denominator for OCD-ish compulsions is that the "right" way for an individual OCD brain is usually specific to that individual brain.

If you write a lot of SQL, writing it in lowercase is OK. That's what I do anyway. If I insert simple prepared statements into code, I usually write keywords with uppercase, but larger code blocks, I write in lowercase. I also tend to write in much fewer lines than the above example, and indent FROM and other clauses further in than the SELECT keyword (so I know they belong to that SELECT statement).