by erikwitt 3416 days ago
Although MongoDB has its limits regarding consistency, there are a few things we do differently from Parse to ensure consistency:

- First, we do not read from slaves (secondaries). Replicas are used only for fault tolerance, as is the default in MongoDB. This means you always get the newest object version from the server.

- Our default update operation compares object versions and rejects the write if the object was updated concurrently. This ensures consistency for single-object read-modify-write use cases. There is also an operation called "optimisticSave" that retries your update until no concurrent modification gets in the way. This approach is called optimistic concurrency control. With forced updates, however, you can overwrite whatever version is in the database; in that case, the last writer wins.

- We also expose MongoDB's partial update operators to our clients (https://docs.mongodb.com/manual/reference/operator/update/). With these, one can increment counters, push items into arrays, add elements to sets, and let MongoDB handle concurrent updates. With these operations, we do not have to rely on optimistic retries.

- The last and most powerful tool, which we are currently working on, is a mechanism for full ACID transactions on top of MongoDB. I've been working on this at Baqend for the last two years and also wrote my master's thesis on it. It works roughly like this:

   1. The client starts the transaction, reads objects from the server (or even from the cache using our Bloom filter strategy) and buffers all writes locally.

   2. On transaction commit, the versions of all objects that were read, together with all updated objects, are sent to the server to be validated.

   3. The server validates the transaction and ensures isolation using optimistic concurrency control. In essence, if there were concurrent updates, the transaction is aborted.

   4. Once the transaction is successfully validated, updates are persisted in MongoDB.

There is a lot more to the details: ensuring isolation, recovery, and scalability, and making it all work with our caching infrastructure. The implementation is currently in our testing stage. If you are interested in the technical details, this is my master's thesis: https://vsis-www.informatik.uni-hamburg.de/getDoc.php/thesis...
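
To make the version check, the optimistic retry, and the four transaction steps above more concrete, here is a minimal Python sketch. It is not Baqend's implementation or API: `Store`, `Transaction`, `optimistic_save` and `ConcurrentModification` are hypothetical names, the "server" is just an in-memory dict, and everything runs single-threaded (a real server would have to make validation and persistence atomic). It also only validates objects that were read, so blind writes go through unchecked.

```python
from dataclasses import dataclass, field


class ConcurrentModification(Exception):
    """Raised when a write is based on a stale object version."""


@dataclass
class Store:
    """Hypothetical in-memory stand-in for the server: key -> (version, value)."""
    objects: dict = field(default_factory=dict)

    def read(self, key):
        # Unknown objects are reported as version 0 with no value.
        return self.objects.get(key, (0, None))

    def update(self, key, expected_version, value, force=False):
        current_version, _ = self.read(key)
        # Default update: reject if someone else wrote in the meantime.
        # A forced update skips the check, so the last writer wins.
        if not force and current_version != expected_version:
            raise ConcurrentModification(key)
        self.objects[key] = (current_version + 1, value)
        return current_version + 1

    def commit(self, read_versions, writes):
        # Step 3: every object the transaction read must be unchanged.
        for key, version in read_versions.items():
            if self.read(key)[0] != version:
                raise ConcurrentModification(key)
        # Step 4: persist the buffered writes only after validation passed.
        for key, value in writes.items():
            self.objects[key] = (self.read(key)[0] + 1, value)


def optimistic_save(store, key, modify, max_retries=10):
    """Re-read and retry the read-modify-write until no conflict occurs."""
    for _ in range(max_retries):
        version, value = store.read(key)
        try:
            return store.update(key, version, modify(value))
        except ConcurrentModification:
            continue  # someone else won the race; try again on fresh data
    raise ConcurrentModification(key)


class Transaction:
    """Client side (step 1): record read versions, buffer writes locally."""

    def __init__(self, store):
        self.store = store
        self.read_versions = {}
        self.writes = {}

    def read(self, key):
        if key in self.writes:  # read your own buffered write
            return self.writes[key]
        version, value = self.store.read(key)
        self.read_versions.setdefault(key, version)
        return value

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        # Step 2: send read versions and buffered writes for validation.
        self.store.commit(self.read_versions, self.writes)
```

In this sketch, a transaction that read an object at version 3 aborts with `ConcurrentModification` if another client bumps that object to version 4 before the commit, and none of its buffered writes reach the store.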