| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by maximilianroos 1050 days ago

[PRQL dev here]

I agree with the sentiments, even if not the conclusion. SQL is omnipresent and is "fine" in a lot of cases.

TypeScript is indeed a great example of the case; Kotlin too. I'd also add that databases are already adding PRQL support — ClickHouse has native support, there's a DuckDB extension, and folks are working on a Postgres extension.

One thing I'll respectfully disagree with — "SQL is highly statically analyzable by nature":

As a really basic example: `SELECT <expr> FROM tbl` — can we tell what shape the result is? In SQL, shapes / types require a lot of context — the result could be a single row in the case of `SUM(foo)`, or it could be every row in the case of `foo, bar`. More in https://prql-lang.org/faq/...

4 comments

dragonwriter 1050 days ago

> As a really basic example: `SELECT <expr> FROM tbl` — can we tell what shape the result is?

Like most statically analyzable code, you have problems if you try to statically analyze part of the code without the relevant definitions.

But, yes, if you have the relevant code (e.g., the DDL for the table), you can. (Without it, you can in the sense that you can statically determine it as a function of the table definition, which can be sufficient in some cases.)

> in SQL, shapes / types require a lot of context — the result could be a single row in the case of `SUM(foo)`, or it could be every row in the case of `foo, bar`.

Oh, you mean, can we statically determine the shape of an expression’s results without knowing the expression? Well, no, and that’s true in most statically-analyzable languages.

link

lolinder 1050 days ago

Can you tell the shape of the result of `<expr>()` in Java by just reading it? No. Does that mean that Java can't be statically analyzed? Of course not!

A static analysis system is not restricted to weird abstract constructs like `select <expr>`, it sees the complete picture and can come to conclusions based on the concrete code construct it's given. There's absolutely nothing stopping a SQL static analysis from recognizing that `SELECT sum(foo)` will always return one row with one column that is an integer type, while `SELECT foo, bar` returns some number of rows that have foo and bar columns whose types can be inferred from the CREATE TABLE statements.

link

maximilianroos 1050 days ago

Again — we agree — with omniscience, it's possible to statically analyze everything. My claim is some languages can be statically analyzed with much less context than others.

Check out "Expression substitution" at https://www.scattered-thoughts.net/writing/against-sql/ if you're interested in more here.

link

Too 1049 days ago

Having some static constraints around <expr> makes it a lot easier to analyze though. Both for humans and compilers.

Otherwise you end up with situations like C++ templates, that can expand to anything. With long confusing errors, unable to distinguish if it’s an error in the usage, the definition or the call-site.

link

chrisjc 1047 days ago

> One thing I'll respectfully disagree with — "SQL is highly statically analyzable by nature"

Are you suggesting that PRQL is capable of this? Or at least easier to do in PRQL?

link

maximilianroos 1046 days ago

Yes, much easier.

Check out "What’s this aggregate function?" at https://prql-lang.org/faq/. Without much context, we can understand the shape of a result.

And because queries can be longer without becoming unreadable, the lineage information is richer.

link

edgyquant 1049 days ago

If you know <expr> then yes you know the shape of the data. You may not know the number of rows, but I fail to see how that makes it not analyzable.

link

maximilianroos 1049 days ago

Quoting from https://www.scattered-thoughts.net/writing/against-sql/:

---

There are many cases where a small change to a computation requires totally changing the structure of the query, but subqueries are my favourite because they're the most obvious way to express many queries and yet also provide so many cliffs to fall off.

    -- for each manager, find their employee with the highest salary
    > select
    >   manager.name,
    >   (select employee.name
    >    from employee
    >    where employee.manager = manager.name
    >    order by employee.salary desc
    >    limit 1)
    > from manager;
     name  | name
    -------+------
     alice | bob

---

If the inner query has more than one row, the query will raise an error. That's difficult to know from the SQL alone.

link

lolinder 1049 days ago

Difficult for the author, but not for static analysis. A static analysis system can pretty easily enforce that `limit 1` be set when a query is used in that position.

link