by functional_test 4148 days ago
This is objectively false. I used MongoDB for >3 years with 100% production uptime and a very large data set.

I get that some people have had bad experiences with Mongo. Sounds like you're one of them. Why not share your experience rather than just spread FUD?

It's worth noting that a lot of the complaints people had about Mongo were with things that were clearly documented (e.g. no confirmations on writes by default back in the day).

2 comments

Agree with @Xorlev, what's "very large" to you? If it's anything less than 1TB, then your definition of "very large" has scale issues (which is fitting if you are a MongoDB advocate I guess).

Further, what's your read/write pattern? Do you do replication? Do you have requirements around writes being visible to reads on slaves nearly immediately? What are your data integrity requirements? All of these play into whether MongoDB is the right fit for someone's needs.

The point remains, MongoDB is good for some use cases, but if you buy into the marketing hype of "the new DBMS standard for any team, in any industry", then you deserve the inevitable pain you will endure (that said, I do feel bad for your likely eventual replacements who have to clean up the mess from your poor choices).

I had ~3TB before I migrated to TokuMX (same semantics, but at the time much faster indices and lower disk space requirements). It was replicated twice, although I could let reads on slaves fall behind a bit.

Of course it's not good for everyone's use cases though. Nothing is. Did anyone honestly believe that? I mean, even with the hype, did anyone truly decide to not evaluate their _database_, and believe the ad copy blindly?

Ultimately, I wouldn't have chosen MongoDB if it was bad for my use case. I tried it and other things -- read the documentation, made benchmarks and test cases -- and chose it after a period of evaluation. To do anything less for your database seems irresponsible (I'm looking at you, people who were surprised about the original default write semantics because you didn't read the docs).

Tangentially, the personal attacks ("which is fitting if you are a MongoDB advocate I guess", "that said, I do feel bad for your likely eventual replacements who have to clean up the mess from your poor choices") add nothing to your argument. They make you look foolish. Also, my replacements are still using my system (quite happily), over a year after my departure.

Neither of those comments was directed at you personally, but at a generic person who would fit the criteria immediately preceding them. "those who think <1TB is a 'very large dataset'" == not understanding modern scale. "if you buy into marketing hype" == you are making poor choices. Neither of those selection criteria applies to you, so I apologize that you took them as personal attacks out of context.

What's a "very large dataset"?

We started having issues around 100GB that forced an upgrade to SSDs. That lasted us til 2TB, at which point we switched.

What else did your config have?

Did you use RAID on your HDs? How much RAM did you have? What types of CPUs and how many? Were you updating documents often or writing new ones? Did you have compound indexes built properly? What were your latency requirements? Was MongoDB able to saturate your resources? Did you read off of secondaries at all? Did you shard your DB and if so what was your shard key? Was it random? Could you use it for reads too? How much of your data set did you need to possibly use at any given time?

If you had to "upgrade to SSDs" at 100GB, you either had amazingly poor provisioning, were doing most everything wrong, or both. Messages like yours indicate to me a poor user - not a poor database.

Our config was pretty standard for a few years ago (~2011-2013): RAID10 on 7200RPM disks, Sandy Bridge CPUs (I believe), 32GB RAM. We did a decent number of updates/s.

10gen couldn't find fault with our setup other than our need to do updates. Which is weird, updating a database? Inconceivable. Anyway, spinning disks apparently couldn't deal with the seeking. Our lock percentage was incredibly high, which makes sense given our "high" write load. It was during this evaluation that we found many of their "critical statistics" were non-deterministic (like the padding factor: rerun and you get values anywhere from 0.0 to 2.0. Probably fixed by now).

We had 1 index: primary. We did no other lookups. MongoDB managed to saturate only the IO subsystem. We did read from secondaries, but that didn't help overly much given that the write percent was high on them too.

On their recommendation we upgraded to SSDs. That dropped our lock percentages and it stayed low. We never made it to sharding because we simply stopped caring about MongoDB. I inherited it and was responsible for it, but ultimately decided to get rid of it in favor of Cassandra.

It's not a "poor user" when you follow their best practices of the time and have a poor experience. It (is|was) a poor database. The storage engine was braindead, their acquisition of WiredTiger admits that. It's at least improving, ever so slowly.

Fair enough, and you're right, the storage system is awful. Switching to WiredTiger was smart of them, and something like that should have been done far in advance.

The problem with updates in MongoDB is that they are done in place and flushed to the filesystem that way. So if you have documents which are growing, the old ones have to be 0'd out and new ones written elsewhere, which results in a ton of wasted space and an unbelievable amount of thrashing. As you saw, that means ridiculous amounts of IO time, and even with SSDs you get pretty bad wear on the drives.

There are ways around this, like pre-padding documents or using buckets if you are pushing into arrays, but I can see why you didn't bother. Also, the write lock back then was system-level - yikes.
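To make the growth problem concrete, here's a rough toy model in Python. It is not MongoDB's actual allocator: the slot sizing, the `count_moves` function, and the `prepad_to` parameter are all made up for illustration. It just counts how often a growing document outgrows its slot and has to be rewritten somewhere else.

```python
# Toy model only: each document lives in a fixed-size slot on disk.
# When an update makes the document larger than its slot, it has to be
# rewritten into a new slot (a "move"). Sizing rules here are assumptions,
# not MongoDB's real allocation logic.

def count_moves(initial_size, growth_per_update, updates,
                padding_factor=1.0, prepad_to=0):
    """Return how many times the document outgrows its slot.

    padding_factor: new slots are sized as document size * padding_factor.
    prepad_to: hypothetical pre-padding, the first slot is at least this big.
    """
    size = initial_size
    slot = max(int(initial_size * padding_factor), prepad_to)
    moves = 0
    for _ in range(updates):
        size += growth_per_update
        if size > slot:
            # Document no longer fits: count a move and allocate a new slot.
            moves += 1
            slot = int(size * padding_factor)
    return moves


if __name__ == "__main__":
    # A 100-byte document that grows by 10 bytes on each of 20 updates.
    print("no padding:   ", count_moves(100, 10, 20))
    print("2x padding:   ", count_moves(100, 10, 20, padding_factor=2.0))
    print("pre-padded:   ", count_moves(100, 10, 20, prepad_to=300))
```

In this model the unpadded document moves on every single update, 2x padding cuts that to one move, and pre-padding to the expected final size avoids moves entirely, at the cost of reserving space up front.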

Around 2011 MongoDB was incredibly awful. I will say it has improved manifold since then.

I had ~3TB and it was all on hard disks (although we had quite a bit of RAM). Ultimately I switched to TokuMX, which substantially reduced my disk footprint, but I was a very happy MongoDB user.

My use cases may have helped (they were very appropriate for MongoDB, but I suppose I would have chosen something else if they weren't).