Hacker News new | ask | show | jobs
by wodenokoto 1055 days ago
I disagree. I find it extremely hard to reason about large queries as set transformations, whereas it is much easier to break it down to "first this, then that". And this is long before I've even started writing my first line of SQL.

So let me write it procedurally and have the optimization engine fix it for me, just like how it fixes my SQL.

Even SQL queries are often better understood procedurally. Take this one [1]:

    SELECT article, dealer, price
    FROM   shop s1
    WHERE  price=(SELECT MAX(s2.price)
                  FROM shop s2
                  WHERE s1.article = s2.article)
    ORDER BY article;
That inner WHERE clause doesn't make sense in my opinion, unless you think of it procedurally as for each row in s1, ask do a search for the highest price amongst all items that share article number.

[1] https://dev.mysql.com/doc/refman/8.0/en/example-maximum-colu...

4 comments

Completely agree, thanks for putting it better than I could have, with an excellent example. Correlated subqueries like the example you give, or similarly lateral joins in postgres, fundamentally are treated like for loops by DB engines anyway.

Semi-related, but the example you give is also why I love Postgres' "DISTINCT ON" functionality (I don't know if other DBs have something similar) - it makes it so much easier to reason about these "give me the 'first' one from each group" type queries without having to resort to correlated subqueries.

queries like these are best suited for window functions, although I am not sure Mysql supports it:

  SELECT article, dealer, price FROM (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY article ORDER BY price DESC) as rnk
    FROM   shop s1
  ) sub 
  WHERE sub.rnk=1
  ORDER BY article; 
this query will be a single pass over table without loops/joins
This is the "set based" approach for the MAX: there does not exist a bigger element:

  SELECT article, dealer, price
    FROM   shop s1
    WHERE  NOT EXISTS (SELECT 1 FROM shop s2 
                         WHERE s2.price > s1.price AND
                               s2.article = s2.article)
    ORDER BY article;
Unpopular opinion.

The uncorrelated example should have been rewritten with a CTE and should have been aliased as 'article_max_price' as if it was a computed property and where price = amp.price