Hacker News new | ask | show | jobs
by dools 5093 days ago
but I really hate ORM’s and this was just a giant nuisance to deal with

I like object relational mapping as a theory (ie. I have an object of type Author which has 1 or more books I can loop over), but I hate ActiveRecord implementations. Eventually, they just end up implementing almost all of SQL but in some arcane bullshit syntax or sequence of method calls that you have to spend a bunch of time learning.

I also seriously doubt that anyone has ever written a production system of any reasonable complexity and been able to use the exact same ORM code with absolutely any backend (if you have an example please correct me on this). This barely even works with something like PDO in PHP which is a bare bones abstraction across multiple SQL backends.

When it comes down to it, the benefits of ActiveRecord are all but dead on about the third day of development. The data mapper pattern adopted by SQLAlchemy (et. al.) takes all of the shitness of ActiveRecord and adds mind bending complexity to it.

SQL is easy to learn and very expressive. Why try and abstract it?

I spent years working with an ActiveRecord ORM I wrote myself in my feckless youth and thought that it was the answer to the world's problems. I didn't really understand why it was so terrible until I did a large project in Django and had to use someone else's ORM.

When I really analysed it, there were only three things that I really wanted out of an ORM:

1) Make the task of writing complex join statements a bit less tedious

2) Make the task of writing a sub-set of very basic where clauses slightly less tedious

3) Obviate the need for me to detect primary key changes when iterating over a joined result set to detect changes in an object (for example, looping over a list of Authors and their Books)

To that end, I wrote this:

https://github.com/iaindooley/PluSQL

It's written in PHP because I like and use PHP but it's a very simple pattern that I would like to see elaborated upon/taken to other languages as I think it provides just the bare minimum amount of functionality to give some real productivity gains without creating a steep learning curve, performance trade-off or any barrier to just writing out SQL statements if that's the fastest way to solve the problem at hand.

4 comments

> I also seriously doubt that anyone has ever written a production system of any reasonable complexity and been able to use the exact same ORM code with absolutely any backend (if you have an example please correct me on this).

You're entirely right here, because databases are different. For example, (I forget the exact details), "select count(*)..." in MySQL is O(1), but it's O(log n) or O(n) in Postgres, depending on indices. That's a detail no ORM is going to save you from.

> SQL is easy to learn and very expressive.

Strongly disagree. The reason everyone keeps trying to write ORMs is because 1) SQL is a shitty language and 2) it's not the language that programmers want to use. Write a better frontend language for Postgres, and the ORMs would disappear.

I strongly suspect that would take some of the wind out of the NoSQL crowd. There are certainly NoSQL deployments that would have a hard time on traditional RDMBS, but there are a lot of other places that use Mongo just because they don't like SQL-the-language, rather than Postgres-the-DB.

No, actually the only reason is "its not a language programmers want to use".

It is very much non shitty.

Its just that lots of programmers, especially OO minded cannot get into its mindset, and use it for what it is, they have to put a lame OO abstraction on top.

Functional programmers shoud fare better in this regard (or Prolog programmers, if they still exist).

If you really want to abstract it, something like LINQ is a better way.

I agree. I see SQL similarly to regular expressions. There's a handful of commands which let you do a lot of stuff.

The hard part in SQL is optimization which requires really understanding how the underlying database engine optimizes and executes the query.

Optimizing complex queries is no joke. It's one of the reasons noSQL seems nice at glance. You can do the optimizations by adding lots of indexes or using application logic. In reality, it's a tradeoff for other problems.

ORMs aren't about saving one the need to know how to write SQL, they are about automating the task of constructing all the redundant SQL needed to marshal data to and from domain objects, as well as all the redundant automation involved in the actual marshaling of data from database client libraries into domain objects.

I think you are exaggerating quite a bit when you refer to SQLAlchemy's patterns adding "mind-bending complexity". Object relational mapping is a complex affair to start with. Have you much experience with modern versions of SQLAlchemy directly (and if not, how fair are comments like that) ?

All I want from an ORM is to manage caching intelligently. I'll learn some arcane bits to assist the ORM's pathfinding, but I simply can't imagine ORM's strong suits being in writing less tedious queries. Granted, the less-strokes and less-verbose nature of an ORM query is still a nice benefit.

    There are only two hard things in Computer Science:
    cache invalidation and naming things.

    -- Phil Karlton
The simplification of writing join queries where the primary key relationships are obvious also has the nice side effect of meaning that if you change relationships between entities (for example from 1-to-many to many-to-many the code won't have to be changed.

At any rate what I'm saying really is that reducing the amount of keystrokes writing and maintaining joins is the only part of SQL where I see there can be significant gains in productivity through automation of the task.

Most ORMs implement where clauses, from clauses, aggregate functions, grouping, having, etc. etc. etc. ie. they wind up basically re-implementing SQL and abstracting it so that your previous knowledge of SQL is basically obsoleted and in order to debug problems or create complex queries you either have to switch entirely to SQL (in which case you lose all query building functionality) or map in your head the SQL you want to achieve, to the arbitrary syntax provided by the ORM software.

I've found SQLAlchemy to be very nice, actually. It first provides very basic abstractions on top of SQL, things like defining tables/columns and querying without having to mess around with strings.

That alone is most of the usefulness of SQLAlchemy, as it lets you write subqueries and joins extremely easily.

On top of that, the (optional) ORM is built as models on top of SQLAlchemy's table/relationship API. These models can be queried almost exactly like the raw tables.