Hacker News new | ask | show | jobs
by ninetyninenine 661 days ago
orms are exercises in OCD.

Databases are the bottleneck your classic website. We choose to query these databases in a extremely high level language called SQL. This language is so high level that you need to come up with tricks and query analyzers in order to hack the high level query into something performant.

A better abstraction would be one that's a bit more similar to a standard programming language with an std that has query related operations/optimizers that can be composed so programmers can compose and choose query operations and avoid optimization issues that are hidden by high level languages like SQL.

We are unfortunately, sort of stuck with SQL (there are other options, but SQL remains popular because years of development has made it pretty good in spite of the fact that it's a poor initial design choice). This is already a historical mistake that we have to live with. Same with javascript (which has been paved over with typescript), same with CSS, etc. The web is full of this stuff. It's fine. My main problem is the ORM.

The ORM is just another layer of indirection. You already have a high level language you're dealing with, now you want to put Another High level language on top of it? ORMs are basically high level languages that compile into SQL.

What is the point? The ORM isn't actually making things easier because SQL is pretty much as high level of a language you can get outside of having an LLM translating direct english.

The point is OCD. Programmers find it jarring to work with SQL strings inside their beautiful language so they want to chop up a SQL string into web app language primitives that they can compose together. Query builders operate on the same concept but are better because they aren't as high level.

This is basically the main reason why Many programmers experience the strange counter intuitive phenomena about why ORMs actually makes things harder. You have to Hack SQL to optimize it. Now you have to hack another ORM on top of it in order to get it to compile it into the hacked query.

2 comments

You are correct in your entire assessment, yet, you seem to underestimate the number of boring CRUD applications serviced by mediocre programmers. Limiting the number of technologies required to have at least a little knowledge in is a benefit, even if it hampers performance, because performance doesn’t matter for the vast majority of cases. Software engineers tend to overvalue that bit; the users of line-of-business apps don’t have a say anyway, but what matters is quick adaptability to changes in business logic.

So having an ORM in place that is tightly integrated in Python and Django lets even the junior developer fresh from the bootcamp make changes to an existing application.

It’s not a pretty story, but in my opinion the reason for staggering layers of complexity is the ability to move quickly even without experts on the team.

I had to scroll all the way down to find this. As a CTO who personally never understood the point of ORMs the benefits become very quickly obvious when your organization starts to scale and the prospect of dozens or hundreds of developers of unknown quality hitting your production db with raw sql becomes objectively frightening. Of course there are ways to setup and administer your db to prevent the most obvious footguns, and it is still possible to write bad queries with an ORM, but having that extra layer with limitations gives some extra peace of mind.
> This language is so high level that you need to come up with tricks and query analyzers in order to hack the high level query into something performant.

What? You have to understand a language to write performant code in it. That’s not a hack, that’s basic competence.

> What? You have to understand a language to write performant code in it. That’s not a hack, that’s basic competence.

The poster is not referring to understanding the language, he/she is referring to having to guess at how to structure the query in a way that increases the chances that a good plan is within the constrained search space of plans (due to it being a combinatorial problem and the optimizer has limited time and information).

No it's not. Many language are designed so certain performance aspects are invisible. Take golang and python: The garbage collector.

The garbage collector is something you should not think about when programming and most programmers don't even need to know it exists. You only think about it when you need to seriously optimize things.

For SQL it's a bad design choice because of what SQL is and what SQL is targeting.

SQL is a high level abstraction that automatically determines a query plan based off of high level input. The problem here is that it can choose a bad query plan. So you need to "hack" the query in order to trick the query planner into doing what you want.

It's also targeting the slowest part of the stack: Non volatile memory and IO. The slowest part of the stack should be targeted with a zero cost abstractions to maintain speed while the fastest part of the stack you can use a language like python for your web app it's fine because databases are magnitudes slower.

Point taken re: gc.

For query plans, in nearly every plan flip I’ve encountered, the root cause came down to either poor table / column statistics (because they had let the table grown too large without appropriate tuning), having an enormous amount of joins beyond the deterministic level the planner could provide, or sub-optimal queries. The latter is growing less common as planners get better and better at rewriting on the fly, but it’s always good to know how to do things optimally.

A good example is semijoins / antijoins. With modern versions of both MySQL and Postgres, you’ll probably have “WHERE foo IN (…)” turned into “WHERE EXISTS (SELECT 1…)”, but it’s better to write it the optimal way in the first place.

I will grant you that indexing can be bewildering, with its many rules, caveats, and gotchas.