Hacker News new | ask | show | jobs
by sbuttgereit 2821 days ago
First, I'm sure mere inertia plays a role, but there are practical reasons as well if you're writing more than trivial SQL queries.

In most other languages you can read the statements fairly linearly, one statement leads to the next, and to the next, etc. At least within functions/methods and then you usually have indentation to help with understanding boundaries. Each "logical line" may be spread out over one or even several on-screen lines, but it's fairly easy to see the whole logical line and understand it at a glance.

Not always so with SQL.

If I get a complex SQL query to analyze, the first thing I do is not look for the SELECT clause (I'll usually have some sense of that before I start), but rather the FROM clause including it's JOINs and such. When keywords are uppercase, scanning the page for where in the statement to start my reading is much easier. Next I want the WHERE clause and its predicates... and finally I want to see what's being selected. So I'm not reading things nearly as linearly to understand a single logical statement like I would to understand say, a well written function. Also, single queries can be very long, so being able to jump around them where there are not going to necessarily be other cues can be important.

I could be wrong, but the density of user defined names (schemas, tables, columns, etc) also is higher than in other languages, so having the keywords in uppercase makes those more spread-out statement boundaries and fewer numbers of keywords in uppercase again helps me scan the statements for those major parts as I need them.

If you're writing single table queries, no sub-queries, maybe a dozen of columns or so, and only a relatively small number of WHERE clause predicates, then it might not matter much. For better or worse, that's not the world I usually find myself in :-)

1 comments

Wouldn't syntax highlighting help with most of the issues you've outlined?
If I've got my developer hat on, yeah, sort of. I still find the case differences more pronounced without necessarily forcing me into a more in-my-face code formatting theme. Also, for all the reason A code format is important, the fact that I have 20 years experience with up-to-now SQL norms mean that I'm trained to look for certain things and can, almost without thinking, focus in on those with the help of historical formatting. Arbitrary changes to that history mean that now I have to discard that muscle memory and learn new ways of looking at things... I'm not sure the gain of a few less caps locks keystrokes merits forcing that learning curve. Sure the kids won't care about that because there's nothing to lose, but some of us old timers lose the benefit of our experience.

But developers with good tooling to deal with more static views of the code aren't the only readers of SQL, and perhaps not the most common.

With my DBA hat on, trying to figure out what's going on reading through logs and such, I don't always have that kind of tooling available. In this context the casing can help as well.... and this may form some of my bias; my first professional experiences on the application side of the world were in very much DBA roles dealing with other people's applications running on my servers. Insofar as access tools preserve developer formatting, having good "color-blind" formats made reading complex queries easier.

Yeah indeed -- though I'd prefer to write all lowercase for the sake of ease, I too stick to the 'uppercasing' norms mainly to help with ease of readability -- especially in shell-only environments where syntax highlighting isn't available [e.g., using psql]
Syntax highlighting is a huge boon working with SQL, but unfortunately you often find you are working with SQL embedded as a string in some other programming language.

I wonder if programming languages should have a way of specifying that a string constant contains embedded code or a template, so text editors can display it accordingly.

I deal with a lot of SQL embedded in code, and prefer uppercase keywords as it seems to help me visually parse the overall file.

PyCharm (and likely other editors by Jetbrains) supports embedded programming languages! However, language level support so that this could be automatically done by all editors would be great.
a lot of programming languages do have a way of specifying that a string constant contains embedded code or a template. Sublime Text for example will look for PHP heredocs with an identifier of XML, HTML, SQL etc. to know what to lex/highlight the string as. Or it will check whether the string starts with an uppercase "SELECT", as the chances are it's gonna be SQL. Code fences in GitHub Formatted Markdown are also a good example of this hinting.