Hacker News new | ask | show | jobs
by chrstr 1197 days ago

  create table x as (select * from person);
  select name from x where ...;
there you go, just configure your editor to display "create table x" as "declare x = " ;)

or even a version with lazy evaluation:

  create view x as (select * from person);
  select name from x where ...;
2 comments

You're still not getting it. First-class status means that anywhere a value or variable can be used, a query or table should also be able to appear, and vice versa. This means a table or query can appear as a return type, a parameter to a stored procedure or query, a variable, and so on.

SQL just does not have this, it instead has 15 different second class ways to handle tables and queries that try to make up for the fact that they are not first-class values. These include CTEs, table valued functions, views, etc.

Usually I balk at the idea of of someone describing a language feature as “first class” because it seems to wishy washy. But in this thread you’ve shown me that maybe the best definition is through “proof by negation,” by patiently responding to arguments and demonstrating why a certain usage and the ensuing restriction around it means it is not first class. Bravo!
I agree the term is often abused, but I think the wikipedia page actually does a decent job of making the notion somewhat precise, along the lines I've been arguing here:

https://en.wikipedia.org/wiki/First-class_citizen

If you want to see what queries as first-class values looks like, LINQ in .NET is pretty close. I can actually write a series of queries that build on and compose with each other, like this:

    IQueryable<Person> RunQuery(int userSelection)
    {
        var first = from x in People
                    select x;
        var second = userSelection == 1
            ? from x in first where x.Birthday > '2000-01-01' select x
            : from x in first where x.Name.Contains("Jane") select x;
        return DumbJoin(first, second);
    }

    IQueryable<Person> DumbJoin(IQueryable<Person> first, IQueryable<second>)
    {
        return from x in second
               join y in first on y.Role equals x.Role into g
               select g;
    }
This query is nonsense, but it just shows you what composition really looks like when queries are first-class values. I wish raw SQL were like this!
> You're still not getting it. First-class status means that anywhere a value or variable can be used, a query or table should also be able to appear, and vice versa. This means a table or query can appear as a return type, a parameter to a stored procedure or query, a variable, and so on.

I doubt you could implement a query planner that would cope with that degree of flexibility. Which means you’d be forced to deal with the mechanics of the query, pushing you away from declarative SQL and into procedural and functional programming. At which point you might as well ditch SQL anyway.

Without these features, people have to resort to dynamically generated SQL queries in procedural or functional languages, which is much worse! SQL has also become significantly more complicated by adding all sorts of second-class features to get around this composability limitation (CTEs, table valued functions, views, etc.).

Besides, I don't think it would be as bad as you say. You can approach it as a simple template expansion into flat SQL queries except where a data dependency occurs, at which point template expansion proceeds in stages, one for each dependency.

LINQ on .NET provides most of the composability I'm talking about, although it has a few limitations as well. Still worlds better than raw SQL.

In PostgreSQL at least, a table can appear as a return type of a function and as a parameter to a function. That's not nothing.
What if I wrote a very long, complicated query that I'd like to test against different tables (like test tables), and let's say I can't use stored functions or procedures. How could I pass different tables to my query?

  CREATE TABLE data_a AS (SELECT 'a' AS test_case, 1 AS value);
  CREATE TABLE data_b AS (SELECT 'b' AS test_case, 2 AS value);
  CREATE VIEW data AS (SELECT * FROM data_a UNION ALL SELECT * FROM data_b);

  CREATE VIEW complicated_query AS (SELECT test_case, value+1 FROM data);

  SELECT * FROM complicated_query WHERE test_case = 'a';
  SELECT * FROM complicated_query WHERE test_case = 'b';
Nice, that is what I was looking for. Of course, it'd need to point to production data as well, so maybe test_case is null, in that case:

  CREATE TABLE data_a AS (SELECT 'a' AS test_case, 1 AS value);
  CREATE TABLE data_b AS (SELECT 'b' AS test_case, 2 AS value);
  CREATE TABLE data_prod AS (SELECT NULL AS test_case, prod_table.value FROM prod_table);

  CREATE VIEW data AS (SELECT * FROM data_a UNION ALL SELECT * FROM data_b UNION ALL SELECT * FROM data_prod);

  CREATE VIEW complicated_query AS (SELECT test_case, value+1 FROM data);

  -- when testing
  SELECT * FROM complicated_query WHERE test_case = 'a';
  SELECT * FROM complicated_query WHERE test_case = 'b';

  -- when in 'production'
  SELECT * FROM complicated_query WHERE test_case IS NULL;
You just reinvented defunctionalization, which is a transformation from a domain that has first-class values to a domain where support is only second-class. Defunctionalization is typically used in programming languages to simulate first-class functions in languages where functions are only second-class citizens, like C and Pascal.

This perfectly illustrates my point. You had to manually defunctionalize your data model and queries to support what I'm saying should be inherently part of SQL.

> languages where functions are only second-class citizens, like C and Pascal.

1) Only if you define Pascal as only Wirth's very first version. That changed almost immediately.

2) Only if you refuse to equate “pointer to function” with “function”. Which in C, where “everything is a pointer” (a bit like in Unix Linux “everything is a file”), seems rather silly.

Check out dbt - it's a great tool for organizing queries and solving such patterns
If you can't use stored procedures which are good for this very case, many databases offer dynamic SQL. That might work in some cases.
That's a good point. This would rule out SQLite for me.
Dynamic SQL isn’t SQL, and it’s not relational. It’s no different from using a language like Python to generate SQL queries.
It's a little different. Anyway, this is under the constraint "no stored procedures."
TVF ( Table-Valued Function) with Cross apply.