Hacker News new | ask | show | jobs
by ww520 1409 days ago
Having is the where-clause for Group By. It's easier to understand by thinking the SQL query as a pipeline.

Stage 1: From returns the whole world of rows.

Stage 2: Where filters down to the desired set of rows.

Stage 3: Group By aggregates the filtered rows.

Stage 4: Having filters again on the aggregated result.

Stage 5: Select picks out the columns.

2 comments

What I never understood is why HAVING and WHERE are different clauses. AFAIU, there are no cases where both could be used, so why can’t one simply use WHERE after a GROUP BY?

(I know that I am probably missing some important technical points, I would like to learn about them)

It doesn't improve the power of SQL, it's just syntactic sugar. Because of how the SQL syntax works, you'd have to do something like:

   WITH A AS (
        SELECT x, sum(y) AS z
          FROM SomeTable
      GROUP BY x
   )

   SELECT * FROM A WHERE z > 10
With an HAVING clause you can instead just tuck it after the GROUP BY clause.

Also, although it's not an issue these days given how good query planners are (any decent engine will produce exactly the same query plan with a subquery or an having clause, it's indexes that fuck up stuff), but you're signaling that the filter happens "at the end".

It's like having both "while" and "for" in a programming language. Technically you don't need it, but it's for humans and not for compilers.

    select product
         , sum(price) as price 
      from table
     where price<1000
    having price>10000

You can refer to the aliased price column before or after aggregation using where or having. Depending on the sql engine.
I didn’t know one could still refer to the value before aggregation after it is aliased.

Is there an implicit group by in this query?

This is non-standard and highly dependent on the sql engine. If you believe in portability, your where clauses should (sadly) use the long form.

If that's not a requirement, this approach can add some clarity

There's plenty of cases - remove the people who have attribute, then aggregate them, then filter the aggregate.

Find me the list of non-deleted users who have more than 50 dollars worth of transactions in the transaction table.

Technically you can always subquery and use a where instead of a having but its nice to ... have.

I have pretty often cases where I use where and having. Where filters the results before the group and having afterwards.