Hacker News new | ask | show | jobs
by rgo 3670 days ago
Everytime I hear arguments for going back to relational databases, I remember all the scalability problems I lived through for 15 years in relational hell before switching to Mongo.

The thing about relational databases is that they do everything for you. You just lay the schema out (with ancient E-R tools maybe) load your relational data, write the queries, indexes, that's it.

The problem was scalability, or any tough performance situation really. That's when you realized RDBMSs were huge lock-ins, in the sense that they would require an enormous amount of time to figure out how to optimize queries and db parameters so that they could do that magic outer join for you. I remember queries that would take 10x more time to finish just by changing the order of tables in a FROM. I recall spending days trying different Oracle hints just to see if that would make any difference. And the SQL-way, with PK constraints and things like triggers, just made matters worse by claiming the database was actually responsible for maintaining data consistency. SQL, with its naturalish language syntax, was designed so that businessman could inquire the database directly about their business, but somehow that became a programming interface, and finally things like ORMs where invented that actually translated code into English so that a query compiler could translate that back into code. Insane!

Mongo, like most NoSQL, forces you to denormalize and do data consistency in your code, moving data logic into solid models that are tested and versioned from day one. That's the way it's supposed to be done, it sorta screams take control over your data goddammit. So, yes, there's a long way to go with Mongo or any generalistic NoSQL database really, but RDBMS seems a step back even if your data is purely relational.

4 comments

> And the SQL-way, with PK constraints and things like triggers, just made matters worse by claiming the database was actually responsible for maintaining data consistency.

I...just...I can't.

Your database should ALWAYS be responsible for maintaining consistency. Your application will likely be dead in a couple years, and if it isn't then you are going to end up having something else interfacing with the database at some point, guaranteed. The only sane place to guarantee consistency is in the database itself, otherwise every time you change constraints you are going to be updating every integration you have and hoping you didn't miss something.

My lord, what has come over developers?

Suppose you're writing a system for vacation rentals. Here are two data consistency rules: 1) only one user may have a given email address 2) a given property can't have two overlapping rentals.

Postgres can enforce both, in a way that's not subject to race conditions. 1) Is a unique constraint and 2) is an exclusion constraint.

As far as I'm aware, the only way for application code to do this would be for it to 1) lock an entire table/collection of data 2) do a query to see what data is there 3) check the new data against the existing 4) write new data 5) unlock the database.

Postgres probably has to do the same thing conceptually, but by using indexes and running the checks in C on the same machine where the data lives, it can do it very quickly.

Are you saying you have a better solution?

I've been in the opposite situation and I couldn't disagree more. But I will say this, it's always possible to take an RDBMS model and de-normalize it and use it like a NoSQL database (like reddit does, for example) but it's not possible to go the other way.
> but it's not possible to go the other way

Why not? We do exactly that. We prototype in mongo and then migrate to postgres when we're comfortable with where the app is headed.

I don't mean it's possible to use a different technology; I mean within the same technology (postgres, for example) you can use it both as a normalized relational database and/or as a de-normalized document store.
> do data consistency in your code, moving data logic into solid models that are tested and versioned from day one

So your consistency check code is more reliable that that which has been built and tested over many years, people and projects?

And your code is guaranteed to be deployed into any project that need to access that database later?