Hacker News new | ask | show | jobs
by alquemist 2076 days ago
There are a handful of examples on Wikipedia: https://en.wikipedia.org/wiki/QUEL_query_languages. One example:

    retrieve (a=count(y.i by y.d where y.str = "ii*" or y.str = "foo"), b=max(count(y.i by y.d))) 
Not a particularly clear 'jumps at you' obvious semantic:

* Are a and b aggregation functions or window functions? If aggregations, how do they compose if the 'by' scopes are different?

* What does max(count(... by ...)) mean? What is the aggregation (window?) scope of max?

* How would an outer where clause compose? What is the evaluation order?

1 comments

> The following table lists aggregate functions:

> count() Number of entries in column

> max() Maximum value in column

> The by clause causes the function to return a set of results, as opposed to a single result. One result is returned for each grouping specified by the by clause. Think of by as meaning "for each."

I assume it evaluates like retrieving set and scalar.

      a   |   b
    --------------
    set 1 | scalar
    set 2 | scalar

[1] http://docs.huihoo.com/ingres/9.3/QUELRef.pdf
Assuming we key a and b by y.d and generalizing b to sets, we are reaching the limits of the relational model: there is no good way to represent multiple sets associated with a given key, we need independent tables to do so. Neither cross product nor null padding is a good way to represent the schema {y, a[], b[]}.

      y.d |  a   |  b
    -------------------
      y0  |  a0  |  b0
      y0  |  a0  |  b1
      y0  |  a1  |  b0
      y0  |  a1  |  b1
      y0  |  a2  |  b0
      y0  |  a2  |  b1

    
      y.d |  a   |  b
    -------------------
      y0  |  a0  |  b0
      y0  |  a1  |  b1
      y0  |  a2  |  null

    # create table y (i int, d int);
    # insert into y values (1, 1), (1, 2), (2, 1);
    # select *, (select count(*) from (select count(y.i), y.d from y group by y.d) _) as foo
      from (select count(y.i), y.d from y group by y.d) _ ;
    
     count | d | foo 
    -------+---+-----
         1 | 2 |   2
         2 | 1 |   2
    (2 rows)
    
By the way great example how unwieldy SQL is. A bit better with CTE:

    # with bar as (select count(y.i), y.d from y group by y.d)
      select *, (select count(*) from bar) as bar from foo;
Nice! Should be easy to do max instead of count. Then top instead of max. Then top3 breaks the model.
Yes, should be max(count). I've lost you with top3.

    # select * from
      (select count from (select count(y.i), y.d from y group by y.d) _ limit 2) b,
      (select count(y.i), y.d from y group by y.d) a;
     count | count | d 
    -------+-------+---
         1 |     1 | 2
         2 |     1 | 2
         1 |     2 | 1
         2 |     2 | 1
    (4 rows)
top - max value in the group

top3 - max 3 values in the group