Hacker News
by superasn 319 days ago
Always use ORMs and then spend the next year debugging N+1 queries, bloated joins, and mysterious performance issues that only show up in prod.

Migrations randomly fail, schema changes are a nightmare, and your team forgets how SQL works.

ORMs promise to abstract the database but end up being just another layer you have to fight when things go wrong.

25 comments

People love to rant about ORMs.

But as someone who writes both raw SQL and uses ORMs regularly, I treat a business project that doesn’t use an ORM as a bit of a red flag.

Here’s what I often see in those setups (sometimes just one or two, but usually at least one):

- SQL queries strung together with user-controllable variables — wide open to SQL injection. (Not even surprised anymore when form fields go straight into the query.)

- No clear separation of concerns — data access logic scattered everywhere like confetti.

- Some homegrown “SQL helper” that saves you from writing SELECT *, but now makes it a puzzle to reconstruct a basic query in a database

- Bonus points if the half-baked data access layer is buried under layers of “magic” and is next to impossible to find.
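
To make the first bullet concrete, here is a minimal sketch using Python's built-in sqlite3 module. The users table and both helper functions are made up for illustration. The only point is the difference between splicing a form value into the SQL text and binding it as a parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])


def find_user_unsafe(name):
    # Vulnerable: the form value becomes part of the SQL text itself.
    sql = f"SELECT id, name FROM users WHERE name = '{name}'"
    return conn.execute(sql).fetchall()


def find_user_safe(name):
    # The value is bound as a parameter, so it is only ever treated as data.
    sql = "SELECT id, name FROM users WHERE name = ?"
    return conn.execute(sql, (name,)).fetchall()


payload = "x' OR '1'='1"
unsafe_rows = find_user_unsafe(payload)  # the OR clause matches every row
safe_rows = find_user_safe(payload)      # no user has that literal name
```

Most ORMs and query builders do the second form for you, which is a large part of why they raise the security floor.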

In short: I’m not anti-SQL, but I am wary of people who think they need to hand-write everything in every application, including small ones with 5 to 50 simultaneous users.

People who avoid ORMs end up writing their own worse ORM*. ORMs are perfect if you know how and when to use them. They encapsulate a lot of the mind-numbing work that comes with raw SQL, such as writing inserts for a 50-column table.
100%. I once tried to optimize a SQL query, moving away from the ORM so I could have more control over the query structure and performance.

I poorly implemented SOLID design principles, creating a complete mess of a SQL Factory, which made it impossible to reason about the query unless I had a debugger running and called the API directly.

I find that Claude writes boilerplate SQL very well, and is effectively an 'ORM' for me - I just get plain SQL for CRUD.

Complex queries I write myself anyway, so Claude fills the 'ORM' gap for me, leaving an easily understood project.

Writing is just half the job. Now try migrations, or even something as fundamental as "find references" on a column name. No, grep is not sufficient, most tables have fields called "id" or "name".
I did that once on a hobby project, accidentally. When I realized the corner I had painted myself into I abandoned it.
I'd say, pure SQL gives you a higher performance ceiling and a lower performance and security floor. It's one of these features / design decisions that require diligence and discipline to use well. Which usually does not scale well beyond small team sizes.

Personally, from the database-ops side, I know how to read quite a few ORMs by now and what queries they result in. I'd rather point out a missing annotation in some Spring Data Repository or suggest a better access pattern (because I've seen a lot of those, and how those are fixed) than dig through what you describe.

The best is when you use an orm in standard ways throughout your project and can drop down to raw sql for edge things and performance critical sections… mmmmm. :chefs kiss:
100%

If a dev thinks that all SQL can be written by hand then they probably haven’t worked with a complex application that relies on complex data.

A good question to ask them is: what problems do ORMs solve? Good answers are:

Schema Changes + migration

Security

Code Traceability (I have a DB field, where is it used)

Code Readability

Standardisation - easy hiring.

Separation of data layer logic and application layer logic.

Code organisation, most ORMs make you put methods that act on a table in a sensible place.

I like Django's ORM for good schema migration. Other "ORMs" people build do not often have a good story around that. So often it's because developers aren't experiencing the best ORMs they could.
Homegrown ORMs are universally terrible and a lot of the anti ORM crowd are really anti homegrown ORM.

I’ve used Django, SQLalchemy and Hibernate. All three have good migration stories.

I think people should go all-in on either SQL or ORMs. The problems you described usually stem from people who come from the ORM world trying to write SQL, and invariably introducing SQL injection vulnerabilities because the ORM normally shields them from these risks. Or they end up trying to write their own pseudo-ORM in some misguided search for "clean code" and "DRY" but it leads to homegrown magic that's flaky.
In Java, the sweet spot is JdbcTemplate. Projects based on JdbcTemplate succeed effortlessly while teams muddle through JPA projects.

It is not that JPA is inherently bad, it's just that such projects lack strong technical leadership.

I believe jOOQ is Java's database "sweet spot". You still have to think and code in a SQL-ish fashion (it's not trying to "hide" any complexity) but everything is typed and it's very easy to convert returned records to objects (or collections of objects).
Sir, sqlc for example.

I know exactly what's going on, while getting some level of idiocy protection (talking about wrong column names, etc).

Poor developers use tools poorly, film at 11.

But seriously, yeah, every time I see a complaint about ORMs, I have to wonder if they ever wrote code on an "average team" that had some poor developers on it that didn't use ORMs. The problems, as you describe them, inevitably are worse.

ORMs can be the starting point: when a query needs it, you can optimize it manually with SQL.

There's also the reality that no two ORMs are necessarily built the same way or to the same performance standard.

I'm wary of people who are against query builders in addition to ORMs. I don't think it's possible to build complicated search (multiple joins, searching by aggregates, chaining conditions together) without a query builder of some sort, whether it's homegrown or imported. Better to pull in a tool when it's needed than to leave your junior devs blindly mashing SQL together by hand.
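
As a rough idea of what "a query builder of some sort" can mean at its smallest, here is a sketch in Python with sqlite3. The Select class, the products table and search_products are all hypothetical and not taken from any library. Conditions are written by the developer with ? placeholders, and user input only ever travels as parameters.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class Select:
    table: str  # comes from code, never from user input
    conditions: list = field(default_factory=list)
    params: list = field(default_factory=list)

    def where(self, clause, *values):
        # clause is developer-written SQL with ? placeholders; values are bound later
        self.conditions.append(clause)
        self.params.extend(values)
        return self

    def build(self):
        sql = f"SELECT * FROM {self.table}"
        if self.conditions:
            sql += " WHERE " + " AND ".join(f"({c})" for c in self.conditions)
        return sql, tuple(self.params)


def search_products(conn, min_price=None, name_contains=None):
    # Each optional filter adds one condition; none are mashed together by hand.
    query = Select("products")
    if min_price is not None:
        query.where("price >= ?", min_price)
    if name_contains:
        query.where("name LIKE ?", f"%{name_contains}%")
    sql, params = query.build()
    return conn.execute(sql, params).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, price REAL)")
conn.executemany(
    "INSERT INTO products (id, name, price) VALUES (?, ?, ?)",
    [(1, "red mug", 5.0), (2, "blue mug", 12.0), (3, "red plate", 20.0)],
)
```

A real builder also has to handle joins, ordering and aggregates, which is where pulling in an existing tool starts to pay off.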

On the other hand, I agree that mapping SQL results to instances of shared models is not always desirable. Why do you need to load a whole user object when you want to display someone's initials and/or profile picture? And if you're not loading the whole thing, then why should this limited data be an instance of a class with methods that let you send a password reset email or request a GDPR deletion?

At least when I see raw SQL I know the author and I are on a level playing field. I would rather deal with a directory full of SQL statements that get run than some mysterious build tool that generates SQL on the fly and thinks it's smarter than me.

For example, I'm working on a project right now where I have to do a database migration. The project uses C# Entity Framework. I made a migration to create a table, realized I forgot a column, deleted the table and tried to start from scratch. For whatever reason, Entity Framework refuses to let go of the memory of the original table and will create migrations to restore the original table. I hate this so much.

You can use EF by writing the migrations yourself ("database first"). Also, whatever problem you have there seems to be easily fixed either by a better understanding of how EF's code generation works, or by more aggressive use of version control.
Their point is that they understand SQL and databases. They shouldn't need to learn EF and all its footguns.
I think you should just create another migration with an ALTER TABLE that adds that column.
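
This is not EF, but the forward-only idea can be sketched with plain SQL and Python's sqlite3. The orders table, the migration names and the schema_migrations bookkeeping table are assumptions made up for this example.

```python
import sqlite3

# Ordered, append-only list: fixing a mistake means adding a new entry,
# not editing or deleting an old one.
MIGRATIONS = [
    ("0001_create_orders", "CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)"),
    ("0002_add_orders_note", "ALTER TABLE orders ADD COLUMN note TEXT"),
]


def migrate(conn):
    """Apply every migration that has not been recorded yet; return the names applied."""
    conn.execute("CREATE TABLE IF NOT EXISTS schema_migrations (name TEXT PRIMARY KEY)")
    applied = {row[0] for row in conn.execute("SELECT name FROM schema_migrations")}
    ran = []
    for name, sql in MIGRATIONS:
        if name in applied:
            continue
        conn.execute(sql)
        conn.execute("INSERT INTO schema_migrations (name) VALUES (?)", (name,))
        ran.append(name)
    conn.commit()
    return ran


conn = sqlite3.connect(":memory:")
first_run = migrate(conn)   # applies both migrations
second_run = migrate(conn)  # nothing left to apply
columns = [row[1] for row in conn.execute("PRAGMA table_info(orders)")]
```

The forgotten column arrives as its own step, and the history of what ran stays in the database instead of in a tool's memory.
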
all of this is solved by an sql query builder DSL
> - Some homegrown “SQL helper” that saves you from writing SELECT *, but now makes it a puzzle to reconstruct a basic query in a database

>- Bonus points if the half-baked data access layer is buried under layers of “magic” and is next to impossible to find.

It’s really funny because you’re describing an ORM perfectly.

I don't know what kind of ORM you have used but I probably wouldn't like it either.

My ORM does far more than those "SQL helper" classes, and it logs SQL nicely to the console or wherever I ask it to log.

And it is easy to find it, just search for @Entity.

They're making the tongue-in-cheek observation that those who don't use an ORM end up reinventing one, poorly.
A bad ORM. Every application that accesses an SQL database contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of an ORM.
You really want something that lets you write

  table=db.table("table1")
  table.insert({"col1": val1, "col2": val2})
at the very least, if you are really writing lots of INSERTs by hand I bet you are either not quoting properly or you are writing queries with 15 placeholders and someday you'll put one in the wrong place.
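
For illustration only, a minimal sketch of what could sit behind a call like that, using Python's sqlite3. The Table class and quote_ident helper are hypothetical, not any particular library's API. Column names and placeholders are generated from the same dict, so a value cannot end up in the wrong position.

```python
import sqlite3


def quote_ident(name):
    # Standard SQL identifier quoting: wrap in double quotes, double any inside.
    return '"' + name.replace('"', '""') + '"'


class Table:
    def __init__(self, conn, name):
        self.conn = conn
        self.name = name

    def insert(self, row):
        """Insert a dict of column -> value and return the new row id."""
        columns = list(row)
        sql = "INSERT INTO {} ({}) VALUES ({})".format(
            quote_ident(self.name),
            ", ".join(quote_ident(c) for c in columns),
            ", ".join("?" for _ in columns),
        )
        # Values are bound as parameters in the same order as the column list.
        cursor = self.conn.execute(sql, [row[c] for c in columns])
        return cursor.lastrowid


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER PRIMARY KEY, col1 TEXT, col2 INTEGER)")
table = Table(conn, "table1")
new_id = table.insert({"col1": "O'Brien", "col2": 2})
```

Values with quotes in them go through untouched because they are never part of the SQL text.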

ORMs and related toolkits have come a long way since they were called the "Vietnam of Computer Science". I am a big fan of jOOQ in Java

https://www.jooq.org/

and SQLAlchemy in Python

https://www.sqlalchemy.org/

Note both of these support both an object <-> SQL mapper (usually with generated objects) that covers the case of my code sample above, and a DSL for SQL inside the host language which is delightful if you want to do code generation to make query builders and stuff like that. I work on a very complex search interface which builds out joins, subqueries, recursive CTEs, you name it, and the code is pretty easy to maintain.

I always see this sentiment here but I just haven't experienced any of it in 14 years with the Django ORM.
This is my life in Django. Querysets have been passed around everywhere and we've grown to 50 teams. Now, faced with ever slower dev velocity due to intertwined logic, and reduced system performance from often wildly non-performant data access patterns, we have spent two years trying to untangle our knot of data access. That led to a six month push, disrupting 80% of teams' roadmaps, to refactor so that ORM objects are not passed around and plain types or DTOs are used instead. Only then can we migrate a core part of our database, which is required for both product development and scaling needs.

Here's the thing. In five of six companies I have worked at, this story is the exact same. Python, Ruby, Elixir. Passing around ORM objects and getting boundaries mixed leading to more interdependencies and slower velocities and poor performance until a huge push is required to fix it all.

Querysets within a domain seems fine, but when you grow, domains get redefined. Defining good boundaries is important. And requires effort to maintain.

I believe your case is not specific to Django ORM in particular but to the inherent complexity of various teams working together on a single project.

For greenfield projects, you have a chance of splitting the codebase into packages with each one having its own model, migrations and repository, and if you want to cross these boundaries, make it an API, not a Django model. For existing projects this is hard to do most of the time though.

One thing that's interesting about Django I thought was that tools like celery will "pickle" orm objects, when they really should be passing the pk's of the objects.

The other thing that's interesting about Django is that you can subclass queryset to do things like .dehydrate() and .rehydrate() which can do the translations between JSON-like data and ORM representations.

Then replace the model manager (in Django at least) with that queryset using queryset.as_manager().

If you're trying to decompose the monolith, this is a good way to start -- since it allows you an easier time to decompose and recompose the orm data.

The simplest can just be:

    def dehydrate(self) -> List[int]:
        return list(self.values_list("id", flat=True))

    def rehydrate(self, *pks) -> Self:
        return self.filter(id__in=pks)
> 50 teams

At that scale any tool will break down without good architecture, ORM or not.

You've never had to use

  .extra()

?
Django has SQL logging so you can see what your queries will do! It's wild.
Hitting the database should be avoided in a web application; use keys as much as possible. All heavy objects should be cached on disk beforehand.
That sounds like an awesome idea for a new, post-React web framework. Instead of simply packaging up an entire web SPA "application" and sending it to the client on first load, let's package the SPA app AND the entire database and send it all - eliminating the need for any server calls entirely. I like how you think!
I can unironically imagine legitimate use cases for this idea. I’d wager that many DBs could fit unnoticed into the data footprint of a modern SPA load.
Yes, probably a lot of storefronts could package up their entire inventory database in a relatively small (comparatively) JSON file, and avoid a lot of pagination and reloads. Regardless, my comment was, of course, intended as sarcasm.
Stream the db to the clients post page load and validate client requests against a cache on the server.
Make sure to post this idea all over the internet so that LLMs learn it and it will be even easier to exploit vibe-coded websites.
I like Ecto's approach in Elixir. Bring SQL to the language to handle security, and then build opt-in solutions to real problems in app-land like schema structs and changesets. Underneath, everything is simple (e.g. queries are structs, remain composable), and at the driver layer it takes full advantage of the BEAM.

It's hard to find similarly mature and complete solutions. In the JS/TS world, I like where Drizzle is going, but there is an unavoidable baseline complexity level from the runtime and the type system (not to criticize type systems, but TS was not initially built with this level of sophistication in mind, and it shows in complexity, even if it is capable).

Ecto is a gold-standard ORM, in no small part because it doesn't eat your database, nor your codebase. It lives right at the intersection, and does its job well.
A couple of years ago I had an opportunity to fill a fullstack role for the first time in several years.

First thing I noticed was that I couldn't roll an SQL statement by hand even though I had a distinct memory of being able to do so in the past.

I went with an ORM and eventually regretted it because it caused insurmountable performance issues.

And that, to me, is the definition of a senior engineer: someone who realised that they've already forgotten some things and that their pool of knowledge is limited.

ORMs are absolutely fantastic at getting rid of the need for CRUD queries and the boilerplate code for translating a result set to a POCO and vice versa. They also allow you to essentially have a strongly typed database definition. They let you trivialise db migrations and versioning, though you must learn the idiosyncrasies.

What they are not for is crafting high performance query code.

It literally cannot result in insurmountable performance issues if you use it for CRUD. It's impossible because the resulting SQL is virtually identical to what you'd write natively.

If you try to create complex queries with ORMs then yes, you're in for a world of hurt and only have yourself to blame.

I don't really understand people who still write basic INSERT statements. To me, it's a complete waste of time and money. And why would you write such basic, fiddly, code yourself? It's a nightmare to maintain that sort of code too whenever you add more properties.

Plenty of tools out here doing plain sql migrations with zero issues.

At my day job everyone gave up on attempting to use the awkward ORM dsl to do migrations and just writes the sql. It’s easier, and faster, and about a dozen times clearer.

> I don't really understand people who still write basic INSERT statements

Because it’s literally 1 minute, and it’s refreshingly simple. It’s like a little treat! An after dinner mint!

I jest, I’m not out here hand rolling all my stuff. I do often have semi-involved table designs that uphold quite a few constraints and “plain inserts” aren’t super common. Doing it in sql is only marginally more complex than the plain-inserts, but doing them with the ORM was nightmarish.

> It’s like a little treat! An after dinner mint!

You completely changed my perspective on simple SQL housekeeping. https://m.youtube.com/watch?v=qYPW3O6VhXo&t=48s

My definition of a senior engineer is someone who can think of most of the ways to do a thing... and has the wisdom to choose the best one, given the specific situation’s constraints.
Perhaps because databases were fundamental to the first programs I ever built (in the ancient 19xx's), but damn, I cannot believe how many so-called experienced devs - often with big titles and bigger salaries - cannot write SQL. It's honestly quite shocking to me. No offense, but wow.
Thing is, this used to be trivial to me, but I spent several years in a purely frontend role, so didn't interact directly with databases at all.

Moreover, the market promotes specialization. The other day I had a conversation with a friend who is rather a generalist and we contrasted his career opportunities with those of a person I know who started out as a civil engineer, but went into IT and over the course of about four years specialized so heavily in Angular, and only that, that he now makes more than the two of us combined.

He can't write an SQL statement - I'm not sure he was ever introduced to the concept. How does that feel?

This is a common sentiment because so many people use ORMs, and because people are using them so often they take the upsides for granted and emphasise the negatives.

I've worked with devs who hated on ORMs for performance issues and opted for custom queries that in time became just as much a maintenance and performance burden as the ORM code they replaced. My suspicion is the issues, like with most tools, are a case of devs not taking the time to understand the limits and inner workings of what they're using.

This fully matches my experience, and my conclusions as well. I'd add that I often don't get to pick whether the logic will be more on the ORM side, or on the DB side. I end up not caring either - just pick a side. Either the DB be dumb and the code be smart, or the other way around. I don't like it when both are trying to be smart - that's just extra work, and usually one of them fighting the other.
The reason why I dislike ORMs is that you always have to learn a custom DSL and live in documentation to remember stuff. I think AI has more context than my brain.

SQL does not really need fixing. And something like sqlc provides a good middle ground between ORMs and pure SQL.

There is a solution engineered specifically for avoiding N+1 queries and overfetching: GraphQL.

More specifically a GraphQL-native graph database such as Dgraph, which can leverage the query to optimize fetching and joining.

Or, you could simply use a CRUD model 1:1 with your database schema and optimize top-level resolvers yourself where actually needed.

Prisma can also work, but is more susceptible to N+1 if the db adapter layer does separate queries instead of joining.

Prisma has shown me that anything is possible with an ORM. I think they may have changed this now, but at least within the last year, distincts were done IN MEMORY.

They had a reason, and I'm sure it had some merit, but we found this out while tracking down an OOM. On the bright side, my coworker and I got a good joke to bring up on occasion out of it.

That sounds plausible in theory, but I've been developing big ol' LOB apps for more than 10 years now and it happens very very sporadically. I mean bloated joins is maybe the most common, but never near enough bloated to be an actual problem.

And schema changes and migrations? With ORMs those are a breeze, what are you on about? It's like 80% of the reason why we want to use ORMs. A data type change or a typo would be immediately caught during compilation, making refactoring super easy. It's like a free test of all queries in the entire system. I assume that we're talking about decent ORMs where schema is also managed in code and a statically typed language, otherwise what's the point.

We're on .NET 8+ and using EF Core.

ORM hate might as well be a free square on "HN web development blog post Bingo".
Funny, I use prisma and pothos, with p99 at below 50ms - no N+1

(When it is not lower, it is because there is a security framework and other fields that might not be mapped directly to the Prisma schema.)

Doesn't prisma do many sql features like distinct... In memory?
Yes, but you can use the `nativeDistinct` preview feature to rely on the DB to perform the operation.

You can see the related issue with more info:

https://github.com/prisma/prisma/issues/23846

Just in case:

Object-relational mapping (ORM) is a key concept in the field of Database Management Systems (DBMS), addressing the bridge between the object-oriented programming approach and relational databases. ORM is critical in data interaction simplification, code optimization, and smooth blending of applications and databases. The purpose of this article is to explain ORM, covering its basic principles, benefits, and importance in modern software development.

Why can't one use ORM and then flag queries which are slow? This is trivial.

Inspect the actual SQL query generated, and if needed modify ORM code or write a SQL query from scratch.
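
A rough sketch of the "inspect the SQL" step, with Python's sqlite3 trace callback standing in for an ORM's query log. The authors/books tables and the helper functions are made up for illustration. The first function issues one query per author on top of the initial one, which is the N+1 shape, and the second gets the same result with a single join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE books (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO authors (id, name) VALUES (1, 'Ann'), (2, 'Ben'), (3, 'Cat');
    INSERT INTO books (author_id, title) VALUES (1, 'A1'), (2, 'B1'), (3, 'C1');
""")

statements = []
conn.set_trace_callback(statements.append)  # record every statement that runs


def titles_n_plus_one():
    result = {}
    authors = conn.execute("SELECT id, name FROM authors ORDER BY id").fetchall()
    for author_id, name in authors:
        # One extra round trip per author: this is the part that hurts in prod.
        rows = conn.execute(
            "SELECT title FROM books WHERE author_id = ?", (author_id,)
        ).fetchall()
        result[name] = [title for (title,) in rows]
    return result


def titles_joined():
    result = {}
    rows = conn.execute(
        "SELECT a.name, b.title FROM authors a "
        "JOIN books b ON b.author_id = a.id ORDER BY a.id"
    ).fetchall()
    for name, title in rows:
        result.setdefault(name, []).append(title)
    return result


def count_queries(fn):
    """Run fn and return (its result, number of SQL statements it executed)."""
    statements.clear()
    value = fn()
    return value, len(statements)


slow_result, slow_count = count_queries(titles_n_plus_one)
fast_result, fast_count = count_queries(titles_joined)
```

Comparing the two counts in a test is a cheap way to flag a regression before it shows up in production.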

We have AI that scans for potential N+1 queries right now.

People forget how SQL works??? People literally try to forget how to program.

More and more programmers use markdown to "write" code.

At the end of the day it's a trade-off. It would be an exception if anyone could remember their own code/customization after 3 months. ORMs or frameworks are more or less conventions, which are easier to remember because you iterate on them multiple times. They are bloated for a good reason, to be able to serve a much larger population than specific use cases, and yes, that does bring its own problems.
Weeks of handwriting SQL queries can save you hours of profiling and adding query hints.

If you want a maintainable system enforce that everything goes through the ORM. Migrations autogenerated from the ORM classes - have a check that the ORM representation and the deployed schema are in sync as part of your build. Block direct SQL access methods in your linter. Do that and maintainability is a breeze.
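
A minimal sketch of such a sync check, assuming SQLite and a hand-written EXPECTED_SCHEMA dict as a stand-in for whatever the ORM classes declare. Both the dict and the schema_drift function are hypothetical. Real ORMs usually ship their own version of this check.

```python
import sqlite3

# Hypothetical stand-in for the columns the ORM model classes declare.
EXPECTED_SCHEMA = {
    "users": {"id": "INTEGER", "email": "TEXT", "created_at": "TEXT"},
}


def schema_drift(conn, expected):
    """Return a list of human-readable differences between models and database."""
    problems = []
    for table, columns in expected.items():
        # Table names come from code, not user input, so quoting them inline is fine here.
        rows = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        actual = {row[1]: row[2].upper() for row in rows}  # column name -> declared type
        if not actual:
            problems.append(f"{table}: table missing")
            continue
        for name, col_type in columns.items():
            if name not in actual:
                problems.append(f"{table}.{name}: column missing in database")
            elif actual[name] != col_type:
                problems.append(f"{table}.{name}: expected {col_type}, found {actual[name]}")
        for name in sorted(actual.keys() - columns.keys()):
            problems.append(f"{table}.{name}: column not declared in models")
    return problems


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
drift = schema_drift(conn, EXPECTED_SCHEMA)  # reports the missing created_at column
```

Failing the build whenever the returned list is non-empty is one way to enforce the "in sync" rule.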

The only time I've seen migrations randomly fail was when others were manually-creating views that prevented modifications to tables. Using the migrations yourself for local dev environments is a good mitigation, except for that.
Skill issue.

In the hand of a good team, ORMs and migrations are an unbeatable productivity boost.

Django is best in class.

Pro tip. Don't use Django migrations. Manage the database first and mirror it in orm later.
Why? Isn't this easier to screw up the prod db?
Business opportunity: Invent a type system that prevents N+1 queries.
But think of how much time you’ll save needing to map entities to tables!!!! Better to reinvest that time trying to make the ORM do a worse job, automatically instead!!
Or just use MongoDB. No ORM needed.
Very practical, like a credit card.

Lets you do what you want here and now and then pay dearly for it afterwards :-)

Eh, nobody wants to transfer rows to DTOs by hand.

My personal opinion is that ORMs are absolutely fine for read provided you periodically check query performance, which you need to do anyway. For write it's a lot murkier.

It helps that EF+LINQ works so very well for me. You can even write something very close to SQL directly in C#, but I prefer the function syntax.

Yeah EF is amazing