by takeda 3711 days ago

Both MongoDB and MySQL were started by people who had no background in writing databases. They were blazingly fast compared to existing RDBMSes, but once they started adding the features people assume a database should have (primarily in the area of consistency), they no longer looked that great.

What makes MySQL better than MongoDB is that it is older and closer to what a database should do. It still has its warts, and some of them might never be removed, since that would break compatibility.

At this point MongoDB does not have anything going for it. Its key benefit was speed, but as it turned out, that was because data was stored mostly in RAM: if Mongo crashed or there was a power loss, you would most likely lose a significant part (possibly all) of your data. They have since fixed that and made it more reliable, but now it performs worse than a relational database [1] and it also doesn't scale well [2].

Essentially, NoSQL databases were designed to be simple, giving up relational features in exchange for scalability and performance. With Mongo you get neither. Mongo doesn't even try to benefit from the CAP theorem: it's neither always consistent nor always available [3].

You should generally default to a relational database, because in the majority of cases you do have some schema and you expect the data to be consistent. NoSQL databases (especially ones that are AP in CAP) are generally good for specialized use cases: data that has no relations and where it is acceptable for records to be wrong or missing occasionally, for example logs or user sessions.

Lastly, regarding the question about performance: you need to understand your data and what you are doing. At my previous job there was a database called region. It was intended for lookups such as latitude/longitude -> ZIP code, and also IP -> ZIP code.

That database was (and probably still is) running on 3 beefy machines running Mongo, and the data it contained was about 13GB. One time I wanted to see how it would work in a relational database, so I loaded the dataset into a Postgres database and installed the PostGIS and ip4r extensions. And you know what? The same data took only 600MB there, and all queries took under a millisecond on the smallest VM.

How come? Mongo does not understand IP addresses, so what they did was convert each IP into a number and store it as a 64-bit integer in Mongo. Postgres, on the other hand, was simply storing IP ranges, with a GiST index on them.
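
To make the "IP as a number" idea concrete, here is a rough Python sketch of what such a lookup looks like when the store has no native IP type. This is only an illustration, not the code from that job: the ranges, ZIP codes, and function names are made up, and it assumes the ranges do not overlap. A sorted list with binary search stands in for the index.

```python
import bisect
import ipaddress


def ip_to_int(ip: str) -> int:
    """Convert a dotted-quad IPv4 address into a plain integer."""
    return int(ipaddress.ip_address(ip))


# Hypothetical (start, end, zip) rows. The addresses come from the
# documentation-reserved blocks and the ZIP codes are arbitrary.
RANGES = sorted([
    (ip_to_int("192.0.2.0"), ip_to_int("192.0.2.255"), "10001"),
    (ip_to_int("198.51.100.0"), ip_to_int("198.51.100.127"), "94105"),
    (ip_to_int("203.0.113.0"), ip_to_int("203.0.113.255"), "60601"),
])
# Range start values only, kept in the same order for binary search.
STARTS = [start for start, _end, _zip in RANGES]


def lookup_zip(ip: str):
    """Return the ZIP for the range containing ip, or None if no range matches.

    Assumes ranges are non-overlapping, so the only candidate is the
    range with the largest start value that is <= the address.
    """
    n = ip_to_int(ip)
    i = bisect.bisect_right(STARTS, n) - 1
    if i >= 0 and n <= RANGES[i][1]:
        return RANGES[i][2]
    return None


if __name__ == "__main__":
    print(lookup_zip("198.51.100.5"))
    print(lookup_zip("198.51.100.200"))
```

With a range type and a GiST index, the database does the equivalent containment search for you, and you store one row per range rather than hand-rolling the integer encoding.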

Why am I telling you this? I think it is important to know that relational databases have been in use for a long time, and many problems have already been solved there. If you're having some performance problem, chances are someone else had it before you (in this case someone wrote an extension providing a new data type).

[1] http://www.enterprisedb.com/postgres-plus-edb-blog/marc-lins...

[2] http://www.datastax.com/wp-content/themes/datastax-2014-08/f...

[3] https://aphyr.com/posts/322-call-me-maybe-mongodb-stale-read...