by zbuf 2004 days ago
Leaving aside the modest achievements of the code so far, the missing prerequisite to "automatically" formatting SQL is... just formatting SQL.

Because SQL (broadly) was designed to be "human" readable in the first place, it's grammatically very inconsistent and has a lot of keywords, many more than other languages in use today such as C.

I've yet to find a pattern of indentation, brackets etc. that satisfies my OCD.

Coming up with an example of a nicely formatted SQL statement is not difficult, but try turning that into consistent 'rules' and you immediately find counterexamples using other parts of SQL.


I make an effort to indent on most keywords, using a Python-like structure to indent related lines to the same level, one more than the keyword operating on them all, with the heuristic that if I can comment one or more lines to debug, I have a readable query.
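One possible reading of that layout, as a rough sketch rather than the commenter's actual style: each keyword sits on its own line, the lines it governs are indented one level below it, and a single line can be commented out without breaking the statement. The `users` table, its columns, and the sample rows are made up for illustration, and SQLite is used only because it ships with Python.

```python
import sqlite3

# Hypothetical table and data, only so the queries have something to run on.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, age INTEGER, active INTEGER)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [("ann", 31, 1), ("bob", 17, 1), ("cy", 45, 0)],
)

# Keywords on their own lines; related lines indented one level deeper.
QUERY = """
SELECT
    name,
    age
FROM
    users
WHERE
    active = 1
    AND age >= 18
ORDER BY
    name
"""

# The "can I comment out a line?" heuristic: disable one condition
# with a SQL line comment and the statement is still valid.
DEBUG_QUERY = QUERY.replace("    AND age >= 18", "    -- AND age >= 18")

rows = conn.execute(QUERY).fetchall()
debug_rows = conn.execute(DEBUG_QUERY).fetchall()
print(rows)
print(debug_rows)
```

Note that the heuristic only holds for lines that do not carry the syntax of their neighbours: commenting out `age` in the select list would leave a dangling comma after `name`.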

I tend to classify SQL statements into two kinds: those that fit in one screen when wrapped in a calling function, and the longer ones that I'm inclined to write in an imperative language instead.

Edit: For the author of the repository, the list of reserved words gets longer and more complex when you support different implementations of SQL, and regex may be insufficient once you consider such parsing questions as whether the keyword is within quotation marks or part of a user-defined name.

https://www.drupal.org/docs/develop/coding-standards/list-of...

https://github.com/AzisK/readsql/blob/master/readsql/regexes...
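The point about quotation marks and user-defined names can be illustrated with a small sketch. This is not how readsql works; it is one common trick, assuming standard SQL quoting only: match quoted spans first so that any keyword inside them is consumed along with the quote. The keyword list and the `upper_keywords` helper are hypothetical and deliberately tiny.

```python
import re

# Illustrative subset only; real dialects reserve far more words.
KEYWORDS = ("select", "from", "where", "and", "or", "as")

SINGLE_QUOTED = r"'(?:[^']|'')*'"   # string literal, '' escapes a quote
DOUBLE_QUOTED = r'"(?:[^"]|"")*"'   # quoted identifier, "" escapes a quote
KEYWORD = r"\b(?:" + "|".join(KEYWORDS) + r")\b"

# Alternation order matters: quoted spans are tried first, so a keyword
# inside quotes is swallowed by the "quoted" group and left untouched.
TOKEN = re.compile(
    "(?P<quoted>" + SINGLE_QUOTED + "|" + DOUBLE_QUOTED + ")"
    + "|(?P<keyword>" + KEYWORD + ")",
    re.IGNORECASE,
)


def upper_keywords(sql: str) -> str:
    """Uppercase listed keywords that appear outside quoted spans."""

    def fix(match: re.Match) -> str:
        keyword = match.group("keyword")
        if keyword:
            return keyword.upper()
        return match.group(0)  # quoted span: return it unchanged

    return TOKEN.sub(fix, sql)


print(upper_keywords(
    "select name as \"from\" from users where note = 'select or die'"
))
```

Because `\b` treats `_` as a word character, a name such as `from_date` is left alone. The sketch still ignores comments, backtick or bracket quoting, dollar-quoted strings, and unterminated quotes, which is roughly where a real tokenizer starts to pay off over regexes.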

Thank you for the feedback and for the link. This is something I also do for SQL code. Initially I was making this to be used as a pre-commit hook for SQL code inside Python for our team, probably inspired by the black Python formatter. I just made the MVP and will propose that our team use it after the holidays. If regexes turn out not to be enough, we still have the power of Python to lend a hand with more complex puzzles.