by sph130 3714 days ago
What's your thought on MongoDB? I recently started a project that is now running in prod for a customer, and I spent almost two weeks deciding between MySQL and MongoDB. I ultimately went with MongoDB, but it's not my forte. I've been thinking that at some point the calls to the database are going to slow down, and I have no idea where to start looking to improve MongoDB performance (MEAN stack application). I'm sure I'll be doing a bit of research on that soon, once I get the base feature set working.
4 comments

I've never used Mongo in anger, but that doesn't matter; my answer is:

Profile the application. Find out what it's spending too much time on, and figure out how to make it do less of that.

Can't upvote this enough. You cannot reliably improve what you cannot reliably measure, and Amdahl's Law will tell you where to work.
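
To make the Amdahl's Law point concrete, here is a minimal sketch in Python. The profile numbers and component names are made up for illustration (they are not measurements from the poster's application); the formula itself is the standard one: if a fraction p of the runtime is made s times faster, the overall speedup is 1 / ((1 - p) + p / s).

```python
def amdahl_speedup(fraction: float, factor: float) -> float:
    """Overall speedup when `fraction` of total runtime becomes `factor` times faster."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    if factor <= 0.0:
        raise ValueError("factor must be positive")
    return 1.0 / ((1.0 - fraction) + fraction / factor)


# Hypothetical profile: share of total request time spent in each component.
# These numbers are assumptions for illustration only.
profile = {
    "db_queries": 0.60,
    "template_rendering": 0.30,
    "json_parsing": 0.10,
}

if __name__ == "__main__":
    # Compare the payoff of making each component twice as fast.
    for name, share in profile.items():
        overall = amdahl_speedup(share, 2.0)
        print(f"{name}: 2x faster here gives {overall:.2f}x overall")
```

With numbers like these, doubling the speed of the 10% component barely moves the total, which is why the measurement has to come first.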

I wish you every success in this business - if it takes off, I might try it myself. I found `oprofile` to be my preferred Linux profiler, and haven't worked out what the corresponding Windows solution is yet.

I would argue that there is literally never a reason to use MongoDB over other solutions. It looks tempting at first, but in reality it has all the downsides of any other database and a whole lot more junk heaped on top.

YMMV of course; that might just be my biased view after weeks of trying to improve performance.

What kind of research did you do where you ended up actually choosing MongoDB? Serious question.

Both MongoDB and MySQL were started by people who had no background in writing databases. They were blazingly fast compared to existing RDBMSs, but once they started adding the features that people assume a database should have (primarily in the area of consistency), they no longer looked that great.

What makes MySQL better than MongoDB is that it is older and closer to what a database should do. It still has its warts, and some of them might never be removed, since that would break compatibility.

At this point MongoDB does not have anything going for it. Its key benefit was speed, but as it turned out, that was because data was stored mostly in RAM: if Mongo crashed or there was a power loss, you would most likely lose a significant part (possibly all) of your data. They have since fixed that and made it more reliable, but now it performs worse than a relational database [1], and it also doesn't scale well [2].

Essentially, NoSQL databases were designed to be simple and to drop relational features in exchange for scalability and performance. With Mongo you get neither. Mongo doesn't even try to benefit from the CAP theorem: it's neither always consistent nor always available [3].

You should generally use a relational database, because in the majority of cases you do have some schema and you expect the data to be consistent. NoSQL databases (especially ones that are AP in CAP) are generally good for specialized use cases: data that has no relations and that can tolerate being wrong or missing occasionally, for example logs or user sessions.

Lastly, regarding the question about performance: you need to understand your data and what you are doing with it. At my previous job there was a database called region. It was intended for tasks such as looking up latitude/longitude -> ZIP code, and also IP -> ZIP code.

That database was (and probably still is) running on 3 beefy machines running Mongo, and the data it contained was about 13GB. One time I wanted to see how it would work in a relational database, so I loaded the dataset into a Postgres database and installed the PostGIS and ip4r extensions. And you know what? The same data took only 600MB there, and all queries took less than a millisecond on the smallest VM.

How come? Mongo does not understand IP addresses, so what they did was convert each IP into a number and store it as a 64-bit integer in Mongo. Postgres, on the other hand, was simply storing IP ranges, with a GiST index.
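
To illustrate the two ideas in that paragraph (an IP address as a plain integer, and a lookup against stored ranges), here is a rough Python sketch. It is not how Mongo, Postgres, or ip4r work internally; it only shows why keeping ranges in an ordered structure makes the lookup cheap. The sample ranges and ZIP codes are invented, `lookup_zip` is a hypothetical helper, and the ranges are assumed not to overlap.

```python
import bisect
import ipaddress
from typing import Optional

# Hypothetical sample data: (first address, last address, ZIP code).
# The addresses come from the reserved documentation ranges; the ZIPs are made up.
RANGES = [
    ("192.0.2.0", "192.0.2.127", "10001"),
    ("192.0.2.128", "192.0.2.255", "94105"),
    ("198.51.100.0", "198.51.100.255", "60601"),
]


def to_int(addr: str) -> int:
    """Convert a dotted IPv4 address to its integer form."""
    return int(ipaddress.ip_address(addr))


# Keep the ranges sorted by start address so a binary search can find candidates.
TABLE = sorted((to_int(lo), to_int(hi), zip_code) for lo, hi, zip_code in RANGES)
STARTS = [row[0] for row in TABLE]


def lookup_zip(addr: str) -> Optional[str]:
    """Return the ZIP for the range containing addr, or None if no range matches."""
    n = to_int(addr)
    # Find the last range whose start is <= n, then check that n is inside it.
    i = bisect.bisect_right(STARTS, n) - 1
    if i >= 0 and n <= TABLE[i][1]:
        return TABLE[i][2]
    return None


if __name__ == "__main__":
    for ip in ("192.0.2.5", "192.0.2.200", "203.0.113.1"):
        print(ip, "->", lookup_zip(ip))
```

A database with a native range type and a suitable index does the equivalent work for you, which is the point of the anecdote.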

Why am I telling you this? I think it is important to know that relational databases have been in use for a long time, and many problems have already been solved there. If you're having some performance problem, chances are someone else had it before you (in this case, someone wrote an extension providing a new data type).

[1] http://www.enterprisedb.com/postgres-plus-edb-blog/marc-lins...
[2] http://www.datastax.com/wp-content/themes/datastax-2014-08/f...
[3] https://aphyr.com/posts/322-call-me-maybe-mongodb-stale-read...