by evilswan 5360 days ago
I had thought about MongoDB, but I'm a bit worried about its global write lock! Surely that can't be a good idea!?

http://www.mongodb.org/display/DOCS/How+does+concurrency+wor...

Or does that only apply if "safe mode" is on, rather than fire-and-forget unverified writes?

2 comments

The global write lock, while important, is probably the most over-written-about and misunderstood issue. The write lock is only held while the db is actually in the middle of a write, and it tends to be held for only a very brief period (tens to hundreds of microseconds or less) at a time. For example, it is never held across multiple user actions, so it is closer to a latch than a lock in common RDBMS terminology.

The main cases where this caused issues for users were when some data we needed wasn't already in memory, so we went to disk to fetch it (~10ms rather than ~10us). One of the major improvements in 2.0 was a framework to yield the lock, allowing other work to proceed while going to disk; it is used in the places where this was seen most frequently in production. 2.2 will plug this in in more places, with the goal of never holding the lock when going to disk. This work is of course in parallel with work to move to DB-level, collection-level, and possibly extent-level locking, replacing the global lock in almost all cases.
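To make the yielding point concrete, here is a toy model in Python. It is not MongoDB's implementation: the lock, the function names, and the writer count are made up for illustration, and the only figures taken from above are the ~10ms disk fetch and ~10us in-memory write. It just compares holding one global lock across a simulated disk fetch with doing the fetch before taking the lock.

```python
import threading
import time

DISK_FETCH_S = 0.010         # ~10 ms page fault, the figure quoted above
IN_MEMORY_WRITE_S = 0.00001  # ~10 us in-memory write, the figure quoted above
NUM_WRITERS = 8              # hypothetical number of concurrent writers

global_lock = threading.Lock()  # stand-in for a single global write lock


def write_holding_lock():
    """Go to disk while still holding the lock (the problem case)."""
    with global_lock:
        time.sleep(DISK_FETCH_S)       # every other writer waits on this
        time.sleep(IN_MEMORY_WRITE_S)


def write_yielding_lock():
    """Fetch the data first, then take the lock only for the write itself."""
    time.sleep(DISK_FETCH_S)           # other writers can proceed meanwhile
    with global_lock:
        time.sleep(IN_MEMORY_WRITE_S)


def time_writers(write_fn):
    """Run NUM_WRITERS concurrent writers and return elapsed wall time."""
    threads = [threading.Thread(target=write_fn) for _ in range(NUM_WRITERS)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


held = time_writers(write_holding_lock)
yielded = time_writers(write_yielding_lock)
print(f"lock held across disk fetch: {held * 1000:.1f} ms")
print(f"lock yielded for disk fetch: {yielded * 1000:.1f} ms")
```

In the first case the disk fetches are serialized behind the lock, so total time grows with the number of writers. In the second they overlap and only the short in-memory step is serialized.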

As for safe-mode, the waiting that it does is always outside of the lock. It uses a cached data structure that is protected by a secondary mutex, so it never has to interact with the global dbMutex. If you choose to wait for replication or journalling of your writes, that will block the connection and therefore your client thread, so single-threaded tests will show much worse performance with the db mostly idle. If you use more client threads or asynchronous I/O connections, you should see roughly identical aggregate throughput (see mongostat), although with much higher latencies, compared with not waiting.
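A rough simulation of the single-threaded vs. multi-threaded effect, again as a sketch rather than real driver code. No MongoDB calls are made: `acknowledged_write`, the 5ms acknowledgement wait, and the client counts are all assumptions chosen for illustration. The wait happens outside the lock, as described above, so it only stalls the client that issued the write.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

ACK_WAIT_S = 0.005   # hypothetical wait for journalling/replication ack
TOTAL_WRITES = 40    # same total work in every run

db_lock = threading.Lock()  # stand-in for the global write lock


def acknowledged_write():
    with db_lock:
        pass                 # the write itself: lock held only briefly
    time.sleep(ACK_WAIT_S)   # waiting for the ack happens outside the lock


def run_clients(num_clients):
    """Split TOTAL_WRITES across num_clients threads; return writes/sec."""
    per_client = TOTAL_WRITES // num_clients

    def client():
        for _ in range(per_client):
            acknowledged_write()

    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=num_clients) as pool:
        futures = [pool.submit(client) for _ in range(num_clients)]
        for future in futures:
            future.result()  # re-raise any exception from a client
    elapsed = time.perf_counter() - start
    return TOTAL_WRITES / elapsed


single = run_clients(1)
multi = run_clients(8)
print(f"1 client:  {single:.0f} writes/sec")
print(f"8 clients: {multi:.0f} writes/sec")
```

With one client, throughput is capped at roughly one write per acknowledgement wait even though the lock is almost always free. With more clients the waits overlap, so aggregate throughput rises while each individual write still pays the same latency.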

If you have any questions about this feel free to shoot me an email. Replace the _ with an @ and add a .com to my user name.

Thanks Mathias, great advice.
Yes, the global write lock essentially means that you can't parallelize your writes, though I hear that collection-level locking is on the horizon. Unverified writes are generally a bad idea for transactional data.

If you _really_ think you need 1,000 writes, only a fully in-memory DB like Redis can offer that... My suggestion would be to go with a simple set-up and consider scaling issues when the need arises. They are always a good headache to have.

Good info, thanks.

MySQL it is then. :)

Just keep in mind that MySQL doesn't scale easily. Master/Slave relationships get tricky to manage. I've only had minor experience with this, but it was enough to know to avoid it.