Hacker News new | ask | show | jobs
by oblio 1618 days ago
The biggest thing is... SQL is not reusable, period.

Why don't we have SQL libraries?

I know that data models are kind of special snowflakes, but some models pop up over and over and over again and code reuse is always 0 with SQL.

To give you an example of a common problem, SLAs or the like for teams with regular business hours.

A team has to respond to a request within N hours. To calculate that I need to take into account 8 business hours per day, excluding weekends, excluding holidays (ideally localized holidays), etc.

It's a nightmare with SQL. It's precisely the kind of thing you want in a library.

Plus, obviously, standard SQL doesn't have a way to share and distribute any libraries, even if they were made. It's pre-C in terms of stuff like that.

5 comments

And the core issue is, sql is just strings. It only fails at runtime and not a compile time. There is no compile-time strong/static typing, you only find out its broken when you run it. So it makes it really difficult to re-use. The other problem is, you have to specify table and column names in queries (duh, how else), but that means the entire persistence model is hard coded and need manual tweaking and so on. In an ideal universe, each field would be generic, each object generic and each collection of objects generic and all functions being generic. You'd end up with one huge flat structure of fields that are linked to each other to form more complex objects. But that not efficient in current computers nor do we have enough memory to store all information like that. And you might end up back at square one where everything is just strings.
> The other problem is, you have to specify table and column names in queries (duh, how else), but that means the entire persistence model is hard coded and need manual tweaking and so on. In an ideal universe, each field would be generic, each object generic and each collection of objects generic and all functions being generic.

Yeah, define the database in a database. And then define the database for the definition of the database in a database.

If you'd like to deliver a product that does something, you've got to stop adding abstraction layers at some point.

The thing is, for sure we could define some use cases where it's possible to take that general case and make it specific. That's why I gave a concrete example.

I refuse to believe that absolutely every data storage and access in this world is unique.

Everybody believe that it's unique, and that's a different story.

OS vendors also thought their hardware was special and magical a long time ago and yet POSIX was invented and suddenly they were all more or less commoditized.

I feel that we're in the teen years of data storage/data access technologies. And SQL is sort of like dental braces.

A long long time ago I sent in a patch to Hibernate to check the validity of all declared HQL queries.

There's no reason why it can't be checked, you just need to have your schema declaration.

Name clashes are everywhere. That's why you have namespaces/packages. In SQL they are called schemas. People don't really use them these days.

The tweaking is one of the great things. If you're hardcoding queries (not sql), you're actually defining the order of operations etc. A query analyzer will use statistics, and you can hint how queries have to be executed, depending on the shape of your data.

Your code is also just strings, until you compile it. Actually, these days until you run it. Your arguments would make more sense in the 90s where people would actually compile code.

You ideal universe is actually "The Inner Platform Effect". Better let pgsql be the data platform ;-)

> Your code is also just strings, until you compile it. Actually, these days until you run it. Your arguments would make more sense in the 90s where people would actually compile code.

His arguments made sense in the 90s, and amusingly, post 2015.

Your argument made sense in the 00s and before 2015.

Swift is compiled (Apple platforms; Objective-C has always been compiled).

Rust is compiled (multiple platforms).

Typescript is compiled (so web/Javascript).

Kotlin is compiled (Android; Java has always been compiled).

C/C++ were always compiled (POSIX; Windows).

C# was always compiled (Windows; POSIX).

Almost every modern language is compiled and if it's not, it's getting a very solid static analysis step that for sure you want to have and run (PHP got types a while back, Python is getting them, Ruby is getting them).

The other problem is, you have to specify table and column names in queries (duh, how else), but that means the entire persistence model is hard coded and need manual tweaking and so on.

This resonates with me If I understand you correctly, with RDBMS/SQL, the structural decisions you make in the database to represent your data "poison" your application making it difficult to change over time.

The data model is the code reuse.

You can model business hours and SLAs with relationships. Join on time and team.

  SELECT support_request.request_time + team.response_sla AS respond_by_time
  FROM support_request 
  JOIN team_sla 
    ON support_request.assigned_team = team_sla.team_id
   AND DATE_PART('day', support_request.request_time) = team_sla.day_of_week
   AND DATE_PART('hour', support_request.request_time) BETWEEN team_sla.start_hour AND team_sla.end_hour;
It's quite impressive how you've missed the core of my message.

Now bundle up your proposal, put it up on Github, license it as MIT, and publish it available on sqlpm.org (SQL Package Manager) so that I can re-use it.

What's that you say? I can't? There's no sqlpm.org? Not even a postgresqlpm.org?

Where's the SQL ecosystem?

Oh, wait, there isn't any because SQL is not really reusable. It's <<all>> one-off scripts, like back in the Dark Ages of software development.

I always install https://github.com/awslabs/amazon-redshift-utils/tree/master... on my Redshift clusters.

What you are describing as code reuse exists for databases, but they are called applications and generally utilize a general purpose programming language. It doesn’t make sense to have a SLA data model library because every use case is different. It’s a database, not procedural code.

You must have not seen enterprise apps.

There's a reason monstrosities like SAP exists, they're practically what you describe.

If stuff like SAP is the future of Line Of Business (LOB) apps, instead of having a rich Open Source ecosystem of data storage and data access libraries, then we've lost.

We're locked in the trunk.

SQL is not modular (the root cause of why it’s hard to reuse) because it’s declarative. If the underlying data models stay relatively stable, you can reuse code with certain assumptions. There’s a trade off here for sure.
No that's not why it's not modular. If that was the case, then other relational algebra-based query languages would suffer the same problem. They do not.
SQL can be somewhat reusable through views. Also in your code you can make reusable functions that contain SQL queries. You won't get the best performance reusing the queries this way, but you can put them into a transaction and build up more complex flows from reusable pieces that way
Postgis and many other libraries exist. They are widely used.
Is Postgis implemented in SQL?
That's how I remembered it from back in early days. There was some pl/sql you had to execute, and you'd have some extensions. I don't think it's like that anymore