by irfansharif 2418 days ago
> Though I understand it differently, as reader just waits until all writes and "txn COMMITTED" record arrive at its node.

Actually, welp, I believe this is closer to the truth in implementation. But consider how read performance compares before and after the introduction of Parallel Commits. Before, when readers happen upon extant 2PC "prepare" phase markers, they would still have to wait for txn resolution (on the coordinator node of the other txn, or on the node where the intent was seen). They simply continue doing the same with Parallel Commits; there's no extra latency added to the read path (except when there is a failure, but even then, as soon as the earlier txn is recovered, future readers no longer get stuck).


> Before, when readers happen upon extant 2PC "prepare" phase markers, they would still have to wait for txn resolution

Again, I understand it differently, but maybe I'm wrong. A reader, upon encountering an unresolved intent, looks up the corresponding transaction record. Before Parallel Commits, it's either marked COMMITTED or PENDING. If it's PENDING, reader just ignores it and skips its data, since they use MVCC. There's no waiting here.

Now with parallel commits, the transaction record can also be marked STAGING, in which case the reader cannot determine whether it's committed without additional work and/or waiting (the author doesn't go into much detail).

> If it's PENDING, reader just ignores it and skips its data, since they use MVCC. There's no waiting here.

I think this is where the confusion is coming from. You're correct that a read can simply ignore writes, even pending ones, at higher timestamps due to MVCC. This improves transaction concurrency.

However, if a read finds a provisional write (an intent) at a lower timestamp, it can't just ignore it. It needs to know whether to observe the write or not. So it looks up the write's transaction record and may have to wait. If the writing transaction is not finalized, the read needs to either wait for that transaction to finish or force the transaction's timestamp up above its own read timestamp. This is true with or without parallel commits.
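To make the rule above concrete, here is a rough sketch in Go of the decision a read makes when it runs into an intent. It is only an illustration of the logic described in this comment, not CockroachDB's actual code: the names `TxnStatus`, `ReadAction`, and `decideRead` are made up, timestamps are plain integers, an `Aborted` status is assumed to exist alongside the ones mentioned in this thread, and `Staging` is conservatively treated as "not finalized".

```go
package main

import "fmt"

// TxnStatus is a hypothetical stand-in for the state of the writer's
// transaction record. Aborted is an assumption not mentioned in the thread.
type TxnStatus int

const (
	Pending TxnStatus = iota
	Staging
	Committed
	Aborted
)

// ReadAction is a hypothetical label for what the read does next.
type ReadAction int

const (
	IgnoreIntent ReadAction = iota // skip the provisional write
	ObserveWrite                   // treat the write as visible
	WaitOrPush                     // wait for the writer, or push its timestamp above the read
)

// decideRead sketches the rule: intents above the read timestamp are
// invisible thanks to MVCC; intents at or below it force the read to
// consult the writer's transaction record.
func decideRead(readTS, intentTS int64, status TxnStatus) ReadAction {
	if intentTS > readTS {
		return IgnoreIntent
	}
	switch status {
	case Committed:
		return ObserveWrite
	case Aborted:
		return IgnoreIntent
	default:
		// Pending or Staging: the outcome is not final yet.
		return WaitOrPush
	}
}

func main() {
	names := map[ReadAction]string{
		IgnoreIntent: "ignore intent",
		ObserveWrite: "observe write",
		WaitOrPush:   "wait or push",
	}
	fmt.Println(names[decideRead(10, 20, Pending)])
	fmt.Println(names[decideRead(20, 10, Pending)])
	fmt.Println(names[decideRead(20, 10, Committed)])
}
```

The sketch glosses over details such as the writer committing at a timestamp different from the one on its intent, so take it as a picture of the control flow only.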

What parallel commits gets us is a faster path to transaction commit, as irfansharif pointed out below. So the write can not only be committed faster with parallel commits, but it can also be resolved faster to get out of other reads' way. In that way, it improves both the synchronous latency profile and the contention footprint of transactions, assuming no coordinator failures.

Thank you for the clarification.