| There is one thing MongoDB does spectacularly well - you can feed in arbitrary JSON and get the same JSON back out. (No need to define schemas or play any kind of system/db administrator.) Even the queries have the same "shape" as JSON, so there is no need for another arbitrary query language. It will eventually bite you, and bite you hard. But you'll be well into the millions of records before that happens. Developers and products below that number will have very smooth sailing. And some live there permanently. One project I worked on years ago involved a music catalogue. Did you know there are only about 20 million songs? The main problem is that things get very painful as you get bigger, especially for writes. A doubling of write activity can lead to calamitous drops in performance. This is especially bizarre because the data model means they could easily support multiple concurrent writers. Heck, even having a lock per 2GB data file would quickly help with concurrency. They have this same "single" approach in other places. For example, building an index is single threaded. I did a restore the other day and then had to wait 8 days while it rebuilt indexes. One CPU was pegged but everything else was idle! It also consumes huge amounts of space - at least double the size of the same data stored as JSON. There are known fixes: https://jira.mongodb.org/browse/SERVER-164 https://jira.mongodb.org/browse/SERVER-863 (note how popular they are and how many years they have been open!) I wish they would focus on making better use of the resources available - it should be possible to max out CPU, RAM and I/O. We've ended up in the same situation as the article, figuring out where to migrate to, with Cassandra being the front runner. |
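To make the "queries have the same shape as JSON" point concrete, here is a rough sketch in plain Python. It is a toy model, not MongoDB or its driver: `matches` and `find` are hypothetical helpers, the sample songs are made-up illustration data, and the matching rule (every field in the query must equal the same field in the document) only approximates the simplest kind of equality query.

```python
import json


def matches(document, query):
    """Toy rule: every field in the query must exist in the document
    with an equal value. (Assumption: plain equality only, no operators.)"""
    return all(
        key in document and document[key] == value
        for key, value in query.items()
    )


def find(collection, query):
    """Return the documents that match the query, in insertion order."""
    return [doc for doc in collection if matches(doc, query)]


# Arbitrary JSON goes in without any schema being declared first.
raw = '{"title": "So What", "artist": "Miles Davis", "year": 1959}'
catalogue = [
    json.loads(raw),
    {"title": "Blue in Green", "artist": "Miles Davis", "year": 1959},
    {"title": "Giant Steps", "artist": "John Coltrane", "year": 1960},
]

# The query is itself a JSON-shaped object: a partial document to match on.
query = {"artist": "Miles Davis", "year": 1959}
titles = [doc["title"] for doc in find(catalogue, query)]
print(titles)
```

The thing to notice is that the query is just a partial document, so there is nothing extra to learn beyond the shape of the data itself. The real thing adds operators, indexes and so on, which this sketch makes no attempt to model.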