by ruw1090 3670 days ago
While I love to hate on MongoDB as much as the next guy, this behavior is consistent with read-committed isolation. You'd have to be using Serializable isolation in an RDBMS to avoid this anomaly.
6 comments

I think this is incorrect, but it's not as simple as the other replies are making it out to be.

Under read-committed isolation, within a single operation, you must not be able to see inconsistent data. So if you do "SELECT *" on a table while rows are being updated, you're guaranteed to always see either the old value or the new value. But if you do two separate statements, "SELECT * WHERE value='new'" and "SELECT * WHERE value='old'" in the same transaction, you may not see the row because its value could have changed. Serializable isolation prevents this case, typically by holding locks until the transaction commits.
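
To make the two-statement case concrete, here is a minimal sketch in Python. It is a toy model, not any real database's implementation: the class `ReadCommittedTable` and its methods are hypothetical names made up for illustration, and the only assumption is that each statement reads whatever is committed at the moment it starts.

```python
import copy


class ReadCommittedTable:
    """Toy table (hypothetical): every statement reads its own snapshot
    of whatever is committed at the moment the statement starts."""

    def __init__(self, rows):
        self.committed = rows  # list of dicts, e.g. {"id": 1, "value": "old"}

    def select(self, value):
        # Statement-level snapshot: consistent within this one statement.
        snapshot = copy.deepcopy(self.committed)
        return [row for row in snapshot if row["value"] == value]

    def commit_update(self, row_id, value):
        # Stands in for another transaction committing a change.
        for row in self.committed:
            if row["id"] == row_id:
                row["value"] = value


table = ReadCommittedTable([{"id": 1, "value": "old"}])

# Statement 1 runs while the row still says "old", so it matches nothing.
first = table.select("new")

# A concurrent transaction commits between our two statements.
table.commit_update(1, "new")

# Statement 2 runs after the commit, so the row no longer says "old".
second = table.select("old")

# The row existed the whole time, yet neither statement returned it.
seen_ids = {row["id"] for row in first + second}
print(seen_ids)
```

Each statement on its own is consistent; the row only goes missing across the pair, which is the behavior read-committed permits.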

It gets messy because the ANSI SQL isolation levels are of course defined in terms of SQL statements, which don't map perfectly to the operations that a MongoDB client can do. Mongo apparently treats an "index scan" as a sequence of many individual operations, not as a single read. So you could argue that it technically obeys read-committed isolation, but it definitely violates the spirit.

This is worse than read-committed because you're not even seeing the old state of the document. If an update moves a document around within the results, and it ends up in the portion you've already read, you just don't see it at all.
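
A rough way to picture it, as a toy simulation in Python. This is not MongoDB's actual cursor code; the `scan` function, the `(key, doc_id)` index layout, and the `move_c` hook are all assumptions made up for illustration. The cursor just remembers the last index entry it returned and continues from the next larger one.

```python
import bisect


def scan(index, on_step=None):
    """Walk a sorted list of (key, doc_id) entries one entry at a time.

    The cursor only remembers the last entry it returned, so anything
    that lands behind that position mid-scan is never visited.
    """
    returned = []
    last = None
    step = 0
    while True:
        if on_step is not None:
            on_step(step, index)  # hook for a simulated concurrent writer
        pos = 0 if last is None else bisect.bisect_right(index, last)
        if pos >= len(index):
            break
        last = index[pos]
        returned.append(last[1])
        step += 1
    return returned


index = [(10, "a"), (20, "b"), (30, "c"), (40, "d")]


def move_c(step, idx):
    # After "a" and "b" have been returned, an update changes c's key
    # from 30 to 5, which moves its entry behind the cursor.
    if step == 2:
        idx.remove((30, "c"))
        bisect.insort(idx, (5, "c"))


result = scan(index, on_step=move_c)
print(result)  # "c" is missing even though it was in the index throughout
```

Document "c" is in the index both before and after the update, but the scan returns neither its old entry nor its new one.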

The article suggests that tuples being moved to different storage locations can cause them to not show up in a table scan.

No such thing can happen in a sane RDBMS, no matter the transaction isolation level.

In Postgres (and a fair number of other databases) you won't see that anomaly, even with read committed. You usually want stricter semantics for an individual query than for the whole transaction.

With read-committed you see the old state.

Quoting from the very first paragraph of the blog post:

> Specifically, if a document is updated while the query is running, MongoDB may not return it from the query — even if it matches both before and after the update!

How's that compatible with READ COMMITTED isolation level?