Hacker News
by thesz 123 days ago
SQL is not a pipeline, it is a graph.

Imagine three joins over three queries A, B and C, where the first join J1 joins A and B, the second join J2 joins A and C, and the third join J3 joins J1 and J2. Note that I said "queries," not "tables" - A, B and C can be complex things one would not want, or be able, to compute more than once. Forget about compute: A, B and C can be quite complex even to write down, and the user may really not want to repeat themselves. Look at TPC-DS; the subqueries in its "with" sections are quite complex.
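A minimal sketch of that shape, using SQLite from Python. The tables (`orders`, `customers`, `refunds`) and their columns are hypothetical, made up only to give A, B and C something to compute. The point is that `a` is written once and referenced by both `j1` and `j2`, so the query text forms a graph rather than a chain. Whether the engine also evaluates `a` only once is up to the engine.

```python
import sqlite3

# Hypothetical toy schema; none of these tables come from the thread.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (customer_id INTEGER, amount INTEGER);
    CREATE TABLE customers (customer_id INTEGER, region TEXT);
    CREATE TABLE refunds (customer_id INTEGER, amount INTEGER);
    INSERT INTO orders VALUES (1, 100), (1, 50), (2, 70);
    INSERT INTO customers VALUES (1, 'EU'), (2, 'US');
    INSERT INTO refunds VALUES (1, 20), (2, 5);
""")

# a, b, c are "queries", not tables; a feeds both j1 and j2, and j3 joins those.
QUERY = """
WITH
  a  AS (SELECT customer_id, SUM(amount) AS spent
         FROM orders GROUP BY customer_id),
  b  AS (SELECT customer_id, region FROM customers),
  c  AS (SELECT customer_id, SUM(amount) AS refunded
         FROM refunds GROUP BY customer_id),
  j1 AS (SELECT a.customer_id, a.spent, b.region
         FROM a JOIN b ON a.customer_id = b.customer_id),
  j2 AS (SELECT a.customer_id, c.refunded
         FROM a JOIN c ON a.customer_id = c.customer_id),
  j3 AS (SELECT j1.customer_id, j1.region, j1.spent, j2.refunded
         FROM j1 JOIN j2 ON j1.customer_id = j2.customer_id)
SELECT customer_id, region, spent, refunded
FROM j3
ORDER BY customer_id
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

A strictly linear pipeline syntax would have to either name `a` somewhere outside the pipe or spell it out twice, which is the repetition described above.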

This is why pipeline replacements for SQL are more or less futile efforts. They simplify the simple part and avoid touching the complex one.

I think that something like Verse [1] is more or less the way to go. Not Verse itself, but functional logic programming as an idea, where you can have first-class data producers and an effect system to specify transactions.

[1] https://en.wikipedia.org/wiki/Unreal_Engine#Verse

4 comments

TIL about Verse. Looks cool, I'll have to check it out.

> SQL is not a pipeline, it is a graph.

Maybe it's both? And maybe there will always be hard-to-express queries in SQL, and that's ok?

The RDBMS's relational model is certainly a graph, and joins accordingly introduce complexity.

For me, just as creators of the internet regret that subdomains come before domains, I really wish we could go back in time and have `FROM` be the first clause and not `SELECT`. This is much more intuitive and lends itself to the idea of a pipeline: a table scan (FROM) that is piped to a projection (SELECT).

A pipeline is a specific kind of graph.
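One way to make that concrete, as a rough sketch: treat a query as a map from each step to the steps it reads from, and call it a pipeline when that map is a single linear chain. The `is_pipeline` helper and the step names (`scan`, `filter`, `project`) are hypothetical, introduced only for illustration; `graphlib` is in the Python standard library from 3.9.

```python
from graphlib import TopologicalSorter

# Each node maps to the set of nodes it reads from.
pipeline = {"scan": set(), "filter": {"scan"}, "project": {"filter"}}
joins = {
    "A": set(), "B": set(), "C": set(),
    "J1": {"A", "B"},
    "J2": {"A", "C"},
    "J3": {"J1", "J2"},
}


def is_pipeline(deps):
    """Return True when the dependency graph is one linear chain.

    Assumes every node appears as a key in deps.
    """
    # Raises graphlib.CycleError if the graph is not acyclic.
    list(TopologicalSorter(deps).static_order())
    consumers = {node: 0 for node in deps}
    for inputs in deps.values():
        if len(inputs) > 1:  # a join: more than one input
            return False
        for source in inputs:
            consumers[source] += 1
    sources = [node for node, inputs in deps.items() if not inputs]
    # One starting point, and no intermediate result is read twice.
    return len(sources) == 1 and all(n <= 1 for n in consumers.values())


print(is_pipeline(pipeline))
print(is_pipeline(joins))

# Any valid execution order must still respect the dependencies.
order = list(TopologicalSorter(joins).static_order())
print(order)
```

The three-join example fails the check twice over: J1, J2 and J3 each have two inputs, and A is consumed by both J1 and J2.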

Yes, there will always be hard-to-express queries; the question is how far we can go.

Thanks, I'll check out Verse.

I haven't seen anyone make the point about graphs before. FWIW PRQL allows defining named subqueries that can be reused, like J1 and J2 in your example.

Crazy to think that Fortnite might unleash a new population of people who toyed with functional-logic as their first paradigm.
Does it really help to call SQL a graph?
right? like it's a graph and a relational model query and a pipeline and a language and an abstract syntax tree and declarative logical plan
It does. Just like any other programming language.
May as well call everything a graph at that point; meaningless.

  > meaningless.
No.

You present "programs are graphs" as a trivial truth. Truly trivial truths are, as you pointed out, meaningless. But you leave out the degree of applicability: the information in the dependence graph differs between programming languages.

Dependencies form a graph, and the analyses needed to optimize execution of the program graph differ wildly between languages. Look at C++ aliasing rules and C's "restrict" keyword.

One can't escape the dependence graph. But one can execute the dependence graph better or worse, depending (pun intended) on the programming language.