by ozataman 4621 days ago
Application design for me almost always begins with data and data structures. Whether my database has an explicit schema or not, I always have one in mind, documented or otherwise reified in the table-data structures I have in my code. I just don't get why people would want a schema-free database that is in almost every way inferior to the rock-solid power beast that is Postgres. Just use a library with proper migration support so you can propagate changes to your schema rapidly during development. You'll thank us later when you learn a little bit of SQL and start analyzing your data, running circles around the no-sql guys.

Cassandra et al. are completely different, in that you don't use them because they are more fun to use. You use them despite their awkward, low-level interfaces because you're going to dump billions of data cells into your database from day one with no end in sight and want all the easy scaling/availability features provided.


Do you always start with the perfect data structure? I find myself adding, removing, and restructuring schema often. Just as you think it's silly to use an "inferior" db during prototyping, I think it's silly to have to jump through hoops -- even minor ones -- while I'm just trying to experiment with a new technology or play with a concept, product design, or pet project. 99 times out of 100, I don't care if my project survives the weekend. Let me use the database I want to use!

> You'll thank us later when you learn a little bit of SQL and start analyzing your data, running circles around the no-sql guys.

That's a little condescending... do you know a single mongo user who doesn't have experience with SQL? Plus, I love the fact that I can literally run javascript against my database. Good for production? Certainly not. But that doesn't mean it's not fun or useful.

Not every project requires such rigor. If that's how you enjoy development, that's great! Very few of my projects put the db layer to the test, and so I'm happy with the balance that mongo gives me. I use it in about 4/5 of my experiments and side projects.

> Do you always start with the perfect data structure? I find myself adding, removing, and restructuring schema often.

Which is why it doesn't make any sense to claim that using MongoDB somehow eliminates needing to migrate your data as it evolves.

That is most easily [cough, imho] solved by adding a version number to stored records. Since the data is not in much of a normal form and there won't be that many joins, it is generally easy to handle in code.

Sometimes you have to update the records that carry a certain version number.
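
A minimal sketch of that version-number idea, in Python with plain dicts standing in for stored documents. The field names, the three versions, and the helper functions are all made up for illustration; nothing here is MongoDB-specific.

```python
CURRENT_VERSION = 3

def _v1_to_v2(doc):
    # Hypothetical change: v1 stored a single "name"; v2 splits it in two.
    first, _, last = doc.pop("name").partition(" ")
    doc["first_name"], doc["last_name"] = first, last
    return doc

def _v2_to_v3(doc):
    # Hypothetical change: v3 adds a list of tags, empty by default.
    doc.setdefault("tags", [])
    return doc

# Maps a version number to the function that lifts a record to the next one.
UPGRADES = {1: _v1_to_v2, 2: _v2_to_v3}

def upgrade(doc):
    """Return a copy of doc upgraded step by step to CURRENT_VERSION."""
    doc = dict(doc)                  # leave the caller's record untouched
    version = doc.get("version", 1)  # assumption: unversioned records are v1
    while version < CURRENT_VERSION:
        doc = UPGRADES[version](doc)
        version += 1
        doc["version"] = version
    return doc
```

Calling `upgrade()` on every read means the rest of the code only ever sees the newest shape. Writing the result back turns it into the "update records with a certain version number" case.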

My opinions, for the record: MongoDB is a tool with some use cases. I'm more of an SQL+Memcached guy, if possible, but not religiously so if a good argument is presented (one that doesn't sound like "let's use .*, I want another keyword on my cv").

If you make 12 schema changes in month 1 and then no schema changes for the next year, does it really make sense to keep a month's worth of data in 12 different formats and maintain code to support all of the different versions? Why not just do a simple schema change and/or data migration each time and be done with it?
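
For comparison, a rough sketch of the "simple schema change each time" approach. It uses Python's built-in `sqlite3` as a stand-in for a real migration library and for Postgres; the `users` table and the three migrations are invented for the example, and SQLite's `user_version` pragma is just one convenient place to keep the schema version.

```python
import sqlite3

# Hypothetical migrations, applied in order; list index + 1 is the schema version.
MIGRATIONS = [
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)",
    "ALTER TABLE users ADD COLUMN email TEXT",
    "CREATE INDEX users_email_idx ON users (email)",
]

def migrate(conn):
    """Apply the migrations this database has not seen yet; return its version."""
    current = conn.execute("PRAGMA user_version").fetchone()[0]
    for version, statement in enumerate(MIGRATIONS[current:], start=current + 1):
        conn.execute(statement)
        # Record progress so a later run skips what is already applied.
        conn.execute(f"PRAGMA user_version = {version}")
        conn.commit()
    return conn.execute("PRAGMA user_version").fetchone()[0]
```

Each schema change is one more entry in the list, and all existing rows end up in the same format, so there is no per-version handling left in application code.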

And since this is supposed to aid in rapid prototyping, how does it do so? It seems to me that it does just the opposite by introducing a significant and totally unnecessary burden.

Generally I'd have at most 2 formats at once, and only while converting the older records to the new format. You're right that there's no sense in keeping around a dozen versions, but there are a lot of business cases for having two versions of a schema active at once. For example, if you can't bring down your application to convert everything mid-day and instead want to do an incremental conversion.
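
One way an incremental conversion like that could look, sketched in Python. A dict keyed by record id stands in for the collection, and the `mail` to `email` rename is a made-up schema change; a real store would select the old records with a query on the version field instead of scanning.

```python
def convert(doc):
    # Hypothetical v1 -> v2 change: rename "mail" to "email".
    new = {k: v for k, v in doc.items() if k != "mail"}
    new["email"] = doc.get("mail")
    new["version"] = 2
    return new

def migrate_batch(store, batch_size):
    """Convert up to batch_size old records in place; return how many were converted."""
    old_keys = [key for key, doc in store.items()
                if doc.get("version", 1) < 2][:batch_size]
    for key in old_keys:
        store[key] = convert(store[key])
    return len(old_keys)
```

Running `migrate_batch` repeatedly, a small batch at a time, lets the application stay up; until it returns 0, readers have to cope with both formats.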
As functional_test said. Also note that this depends on, e.g., how long-lived your data is.

(An update routine can be run during any low-usage period, like Xmas. This is potentially neat, depending on usage statistics.)

I'm not saying this is a common thing, but the lack of joins makes the data a bit more flexible -- though it can't be pushed too far, if nothing else because the JavaScript will then begin to break.

(I do think there are many more use cases for NoSQL than as a Memcached with more features. The way an old job of mine used MongoDB wasn't one of them.)