by hardwaresofton 4080 days ago
What you've brought up is precisely where relational databases excel. It's a tradeoff: as with anything else, you have to decide when to use one or the other. How many relationships do you have or expect? How intermingled is your data? If the answer is "not much", then maybe you should go with documents; if it's "a lot", maybe you should go with an RDBMS.

Also, the lines are blurred, like I previously stated:

- RethinkDB does joins

- Postgres has good support for storing JSON documents


In reality, almost all interesting data has relationships. Even a two-dimensional graph expresses a relationship (and hopefully a correlation) between two sets of data.

I think whether or not there are relationships is the wrong way to consider the question of document vs relational. I think of a relational schema as a canonical way to store your data so that you can easily and efficiently project it in multiple different ways (queries).

A document schema stores your data as one possible projection of that data. Obviously you're right that if you can know for sure that one particular projection is the vast majority of queries then this is fine. The problem is that you have to be right about which projection of your data is most important, and normally before you write any code.
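To make the projection idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The authors/posts tables and their rows are hypothetical, made up purely for illustration. The same normalized rows get projected two different ways by two queries, while the document shape at the bottom bakes in only one of them.

```python
import sqlite3

# Hypothetical canonical (normalized) schema: authors and their posts.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE posts (
        id INTEGER PRIMARY KEY,
        author_id INTEGER NOT NULL REFERENCES authors(id),
        title TEXT NOT NULL
    );
    INSERT INTO authors VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO posts VALUES (1, 1, 'intro'), (2, 2, 'reply'), (3, 1, 'follow-up');
""")

# Projection 1: every post alongside its author's name.
posts_with_author = conn.execute(
    "SELECT p.title, a.name FROM posts p "
    "JOIN authors a ON a.id = p.author_id ORDER BY p.id"
).fetchall()

# Projection 2: number of posts per author.
posts_per_author = conn.execute(
    "SELECT a.name, COUNT(p.id) FROM authors a "
    "LEFT JOIN posts p ON p.author_id = a.id "
    "GROUP BY a.id, a.name ORDER BY a.name"
).fetchall()

# A document model commits to one projection up front, here "author with posts".
author_documents = [
    {"name": "alice", "posts": ["intro", "follow-up"]},
    {"name": "bob", "posts": ["reply"]},
]
```

With the documents, "posts per author" is cheap, but "all posts in order with their author" means walking every document and re-sorting in application code.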

In my experience, for greenfield projects it's surprisingly hard to predict which projections will be most important. Often what is interesting to users changes, or the projection you picked turns out to be not quite right.

To compound this, many document stores don't have good schema migration tools. Migrating schemas and data is actually a lot easier with SQL than anywhere else. Postgres can migrate the schema and the data in a single transaction, and you can roll back if any of the steps fails. The changes are also type checked.
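As a rough sketch of what a transactional migration looks like: the example below uses SQLite through Python's sqlite3 module as a stand-in, because it runs without a server and, like Postgres, lets schema changes take part in a transaction. The messages table, the locale column and the `migrate` helper are all hypothetical.

```python
import sqlite3

# isolation_level=None turns off the module's implicit transactions,
# so BEGIN / COMMIT / ROLLBACK below are issued explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE messages ("
    "id INTEGER PRIMARY KEY, body TEXT NOT NULL, sender TEXT NOT NULL)"
)
conn.execute("INSERT INTO messages (body, sender) VALUES ('hi', 'alice')")

def column_names(conn):
    # PRAGMA table_info returns one row per column; index 1 is the name.
    return [row[1] for row in conn.execute("PRAGMA table_info(messages)")]

def migrate(conn, fail=False):
    """Change the schema and backfill the data as one atomic unit."""
    conn.execute("BEGIN")
    try:
        conn.execute("ALTER TABLE messages ADD COLUMN locale TEXT")
        conn.execute("UPDATE messages SET locale = 'en-US'")
        if fail:
            raise RuntimeError("simulated failure in a later step")
        conn.execute("COMMIT")
        return True
    except Exception:
        # Undoes both the new column and the backfilled values.
        conn.execute("ROLLBACK")
        return False

failed_result = migrate(conn, fail=True)
columns_after_failure = column_names(conn)

ok_result = migrate(conn)
columns_after_success = column_names(conn)
```

After the failed run the table still has its original three columns, so the migration can simply be retried once the problem is fixed.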

Basically, when starting a new project, you are probably better off starting with a relational model of your data than anything else. Later, when you know your data access patterns, you can change to a document store and get its benefits without the tradeoffs.

Agreed that how you plan to project the data is certainly important (SQL offers immense flexibility, assuming your data is heavily normalized). But still, the point stands that you can use some document stores relationally, which blunts the argument a little.

Also, document stores' migration tools are the supported languages themselves... The only migration tool you need is a function from one document to another. RethinkDB actually supports almost arbitrary function input. Also, migrations are a little less important when it comes to document stores because a lot of the schema validation logic gets moved up one level (to the middle layer).
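A minimal sketch of "a function from one document to another", in plain Python with no particular database driver. The field names (`ts`, `timestamp`, `schema_version`) are hypothetical, and how you would feed the documents back to the store depends on the database.

```python
def migrate_document(doc):
    """Return a new-shape copy of doc; the input is left untouched."""
    migrated = dict(doc)
    # Rename the old hypothetical field "ts" to "timestamp" if present.
    if "ts" in migrated:
        migrated["timestamp"] = migrated.pop("ts")
    # Tag the document so already-migrated ones are easy to recognize.
    migrated.setdefault("schema_version", 2)
    return migrated

old_documents = [
    {"id": 1, "ts": 1400000000},
    {"id": 2, "timestamp": 1400000060, "schema_version": 2},
]
new_documents = [migrate_document(doc) for doc in old_documents]
```

Because the function leaves already-migrated documents alone, it can be run over a collection that is only partly migrated.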

I strongly disagree about which one to start with. One of the most widely noted benefits of a document store is that you don't have to think of your schema up front, you don't have to maintain constantly changing schema declarations, etc. Did you mix up the terms by accident?

If you didn't mix them up, here's a concrete example of why document stores are often better to start with:

- I'm writing a chat service

- I am thinking about the message model, and decide a message has three fields: a message, a sender, and a timestamp.

- Three minutes later, while building the service, I realize that I also want to track the origin of the message (through some geolocation service or something), or the locale.

If I had used a relational database, I'd have to edit the schema, re-run the migration against the database, and so on, whereas in a document store I could just carry on as if the locale were there (and maybe add one line of code supplying a default/null value for locale when it's missing from objects I read from the database).
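Here is roughly what that one line could look like, as a sketch in plain Python. The list of dicts stands in for whatever the document store returns, and `read_message` is a hypothetical helper, not part of any driver.

```python
# Documents as they might come back from the store: the first was written
# before the locale field existed, the second after.
stored_messages = [
    {"message": "hi", "sender": "alice", "timestamp": 1400000000},
    {"message": "salut", "sender": "bob", "timestamp": 1400000060, "locale": "fr-FR"},
]

def read_message(doc):
    # The "one line": fall back to None when the stored document has no locale.
    return {**doc, "locale": doc.get("locale")}

messages = [read_message(doc) for doc in stored_messages]
```

The rest of the service can then read `locale` on every message without caring when the document was written.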