Hacker News new | ask | show | jobs
by btilly 1759 days ago
Meh, summarizing, grouping, etc aren't that hard.

However WINDOW queries definitely have a learning curve. Not the least because useful examples almost always require you to use a nested query.

2 comments

I learned SQL on a need to know basis. For me, recursive queries were the ones that needed the most time to click.

Another one that caught me by surprise was NULL vs unknown[1]. That bit me in a couple of queries.

[1]: https://learnsql.com/blog/understanding-use-null-sql/

The problem with NULL in SQL is that the semantics are inconsistent in complex ways. See for example https://vettabase.com/blog/what-does-null-mean-in-sql/.
Recursive queries are definitely top of the complexity pile IMO. However I usually discount them because I am yet to actually need them in a production environment. Window functions are super useful on the other hand.
Heh, perhaps it's more accurate to say I've abused recursive queries.

One instance was to turn a column containing comma separated values into rows[1], so I could join on them. Wasn't for a query that needed performance of course.

[1]: https://news.ycombinator.com/item?id=28020321

I had this same issue in Redshift and ended up populating a table with values 1 to the maximum number of commas found (e.g. using max(regexp_count(...)) or something), then cross joining on the table with the csv column and calling split_part on the corresponding column and index (with the index coming from the numbers table). The cross join ensures that you index every value of the csv column.
What I think you'd do here (in modern implementations of SQL at least) is to split the values into an array using some convenient split function and self join on that array to pivot the arrays into rows.
> some convenient split function

If only the DB we're using had one of those :)

A lot of people try to do stuff that doesn't actually make sense when it comes to groups. Like selecting a column which isn't contained in the group by. And they are confused by the error but when you talk them through it "What did you actually want to see? There are multiple values for this column now" it starts to become clear to them.