| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by alquemist 2076 days ago

There are a handful of examples on Wikipedia: https://en.wikipedia.org/wiki/QUEL_query_languages. One example:

    retrieve (a=count(y.i by y.d where y.str = "ii*" or y.str = "foo"), b=max(count(y.i by y.d)))

Not a particularly clear 'jumps at you' obvious semantic:

* Are a and b aggregation functions or window functions? If aggregations, how do they compose if the 'by' scopes are different?

* What does max(count(... by ...)) mean? What is the aggregation (window?) scope of max?

* How would an outer where clause compose? What is the evaluation order?

1 comments

sergeykish 2075 days ago

> The following table lists aggregate functions:

> count() Number of entries in column

> max() Maximum value in column

> The by clause causes the function to return a set of results, as opposed to a single result. One result is returned for each grouping specified by the by clause. Think of by as meaning "for each."

I assume it evaluates like retrieving set and scalar.

      a   |   b
    --------------
    set 1 | scalar
    set 2 | scalar

[1] http://docs.huihoo.com/ingres/9.3/QUELRef.pdf

link

alquemist 2075 days ago

Assuming we key a and b by y.d and generalizing b to sets, we are reaching the limits of the relational model: there is no good way to represent multiple sets associated with a given key, we need independent tables to do so. Neither cross product nor null padding is a good way to represent the schema {y, a[], b[]}.

      y.d |  a   |  b
    -------------------
      y0  |  a0  |  b0
      y0  |  a0  |  b1
      y0  |  a1  |  b0
      y0  |  a1  |  b1
      y0  |  a2  |  b0
      y0  |  a2  |  b1

    
      y.d |  a   |  b
    -------------------
      y0  |  a0  |  b0
      y0  |  a1  |  b1
      y0  |  a2  |  null

link

sergeykish 2075 days ago

    # create table y (i int, d int);
    # insert into y values (1, 1), (1, 2), (2, 1);
    # select *, (select count(*) from (select count(y.i), y.d from y group by y.d) _) as foo
      from (select count(y.i), y.d from y group by y.d) _ ;
    
     count | d | foo 
    -------+---+-----
         1 | 2 |   2
         2 | 1 |   2
    (2 rows)

By the way great example how unwieldy SQL is. A bit better with CTE:

    # with bar as (select count(y.i), y.d from y group by y.d)
      select *, (select count(*) from bar) as bar from foo;

link

alquemist 2074 days ago

Nice! Should be easy to do max instead of count. Then top instead of max. Then top3 breaks the model.

link

sergeykish 2074 days ago

Yes, should be max(count). I've lost you with top3.

    # select * from
      (select count from (select count(y.i), y.d from y group by y.d) _ limit 2) b,
      (select count(y.i), y.d from y group by y.d) a;
     count | count | d 
    -------+-------+---
         1 |     1 | 2
         2 |     1 | 2
         1 |     2 | 1
         2 |     2 | 1
    (4 rows)

link

alquemist 2074 days ago

top - max value in the group

top3 - max 3 values in the group

link