HN Mirror

by moe 5126 days ago

[2] When you look at [1] you'll notice that these exact problems are still prevalent.

In fact, I'd be curious how exactly you worked around the sharding issues at 4square.

Remember, I replied to someone who claimed it takes "no engineering effort" to scale MongoDB. That's not only obviously false, but the last time I tried, the sharding was so brittle that recommending it as a scaling path would border on malice.

I ran a few rather simple tests for common scenarios; high write-load, flapping mongod, kill -9/rejoin, temporary network partition, deliberate memory starvation. MongoDB failed terribly in every single one of them. The behavior ranged from the cluster becoming unresponsive (temporarily or terminally), to data corruption (a collection disappears or becomes inaccessible with an error), to silent data corruption (inconsistent query results), to severe cluster imbalance, to crashes (segfault, "shard version not ok" and a whole range of other messages).

I didn't try very hard; it was terribly easy to trigger pathological behavior.

My take-home was that I most certainly don't want to be around when a large MongoDB deployment fails in production.

As such, I'm a little disconcerted every time the Mongo scalability myth is restated on HN, usually by people who haven't even tried it beyond a single instance on their laptop.

2 comments

jshen 5124 days ago

"I ran a few rather simple tests for common scenarios; high write-load, flapping mongod, kill -9/rejoin, temporary network partition, deliberate memory starvation. MongoDB failed terribly in every single one of them. "

What other databases did you go to the same lengths to make fail which handled them gracefully?

leothekim 5125 days ago

I don't think what I said was bullshit. So you wrote tests to make mongo fail, and you've seen cases where people run into problems with it. That still doesn't disprove my point. With postgres, you roll your own sharding. With mongo, you don't have to.

moe 5125 days ago

Sure, makes sense. If you're happy with your deployment randomly failing.

leothekim 5125 days ago

/troll.

moe 5125 days ago

It seems you fall squarely into the bucket of 'people who haven't even tried' (and some other unfavorable buckets, but I'll leave that to your older self to judge).

leothekim 5118 days ago

Wow, I just saw your awesome response! FWIW, I too work at foursquare and sit next to Neil (nsanch). Feel free to verify by asking him!

I reiterate: /troll
