by dccoolgai
3601 days ago
I agree... but at this point, it's tough to see with all the ink that's been spilled on these issues for years how you could think anything else. Maybe I read HN too much, but the manifold problems with MongoDB have been widely publicized for the past 6 years... it seems pretty close to conventional wisdom that you're going to have those problems if you decide to use Mongo.
I last used MongoDB seriously in 2012-2015. We had myriad operations problems including inconsistent indexing across shards (where some shards had an index created and others didn't, it was baffling), issues with the balancer not moving chunks properly, and more. Also it's just different than other DBs with its lack of transactional consistency (I think they've made progress on building this), but that's part of why it's fast.
However, the bigger problem is that document databases -- in general -- enable a kind of software development where the model sort of emerges over time, rather than being carefully designed from the beginning. Yes, it's flexible, but you pay an absolutely enormous cost down the line dealing with inconsistent documents. It's not like code where if you do something stupid, you can fix it over time with refactoring and "remodeling" -- data has mass. You can get into a situation where, with a large data set, it can take a week or more just to run the migration script required to scan an entire collection and rewrite a few billion documents into a new, better format.
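To make the "data has mass" point concrete, here's a rough sketch of what that kind of migration tends to look like. Everything in it is made up for illustration: the three historical document shapes, the field names, and the throughput figure are assumptions, not anything from a real system.

```python
SECONDS_PER_DAY = 86_400

def normalize(doc: dict) -> dict:
    """Rewrite one of several hypothetical historical shapes into a single target shape."""
    if isinstance(doc.get("name"), dict):
        return doc  # already in the target shape, so re-running is harmless
    if "first_name" in doc:
        # Later shape: separate first_name / last_name fields
        first, last = doc["first_name"], doc["last_name"]
    elif "fullName" in doc:
        # Middle shape: a single camelCase string
        first, _, last = doc["fullName"].partition(" ")
    else:
        # Earliest shape: a single "name" string
        first, _, last = doc["name"].partition(" ")
    return {"_id": doc["_id"], "name": {"first": first, "last": last}}

def migration_days(doc_count: int, docs_per_second: float) -> float:
    """Back-of-envelope wall-clock days to scan and rewrite doc_count documents."""
    return doc_count / docs_per_second / SECONDS_PER_DAY

# Assumed numbers: 3 billion documents at a sustained 5,000 rewrites per second.
print(round(migration_days(3_000_000_000, 5_000), 1))  # → 6.9
```

The per-document logic is trivial. The week comes purely from the volume, and every extra shape that crept in over the years is another branch you have to discover and handle before you can even start.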
There is no such thing as a "schemaless" database. That's like saying, oh sure, we just have a bunch of 1s and 0s in memory -- our data is "structureless". The question is whether the database enforces the schema, or not. And I think that in a lot of cases, it's a lot worse to have an "uncodified schema" than a rigid, but at least well-defined, one, that's consistent across the data at all points.
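If the database won't enforce the schema, the least you can do is codify it somewhere. A minimal sketch of what I mean, assuming a hypothetical `REQUIRED` field map and a check you would run before every write (the field names are invented for the example):

```python
# Hypothetical schema: field name -> expected Python type.
REQUIRED = {"_id": int, "email": str, "created_at": str}

def schema_errors(doc: dict) -> list[str]:
    """Return a list of schema violations for doc; an empty list means it conforms."""
    errors = []
    for field, expected in REQUIRED.items():
        if field not in doc:
            errors.append(f"missing field: {field}")
        elif not isinstance(doc[field], expected):
            actual = type(doc[field]).__name__
            errors.append(f"{field}: expected {expected.__name__}, got {actual}")
    # Fields nobody declared are how an "uncodified schema" starts to drift.
    for field in sorted(doc.keys() - REQUIRED.keys()):
        errors.append(f"unexpected field: {field}")
    return errors

print(schema_errors({"_id": "1", "mail": "x"}))
```

It's not as strong as the database rejecting the write, since any code path that skips the check can still insert garbage, but at least the schema is written down in one place.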
Sidenote: It's also occurred to me over the past few years that it's almost impossible to impose a consistent schema on a large enough dataset. If you truly are dealing with "big data" (TB/PB scale), maybe go straight to a document store or a columnar store, because doing a migration is outright impossible, but don't be so quick to write off an enforced schema for GB-scale datasets.