Hacker News new | ask | show | jobs
by richieartoul 1410 days ago
I’m not 100% sure, but my reading of the CockroachDB Jepsen test: https://jepsen.io/analyses/cockroachdb-beta-20160829 indicates that it does meet those two requirements, but that it still is not globally strictly serializable due to the presence of an anomaly they call “causal reversal” where: “transactions on disjoint records are visible out of order.”

If you read their Jepsen report and blog posts carefully, the anomaly they suffer from is not caused by a read transaction that started after previous write transactions had completed, but by a read transaction running concurrently with two other write transactions which both commit while the read is still running, and the read sees the writes in the wrong order. The definition you’ve provided above does not cover this case (I think) but its still a violation of strict serializability which makes me think your definition is too loose.

Correct me if I’m wrong though, at this level things get quite confusing.

1 comments

This stuff is very subtle, but you have to consider all three transactions when applying the guarantees.

If w1 and w2 overlap their execution then any serializable execution is strict serializable, so w1 must have finished before w2 began. In this case (2) applies and w1 must have the effect of executing before w2, to all observers, including r.

If w1 and w2 are commutative then it doesn’t matter in what order they execute as they’re equivalent. If not, then at some point their dependency graphs intersect and will be/have been ordered wrt each other by the same restrictions.

edit: in the example given in the Jepsen report, with Accord r would have to execute either before w1; after w1 and before w2; or after them both. This is because one of the following must occur:

1) r takes a dependency on both w1 and w2 and executes after them both;

2) r takes a dependency on only w1; w2 takes a dependency on r, so w2 executes after r, and r executes after w1

3) w1 and w2 both take a dependency on r, so that r executes before either

Isn’t that ignoring the point that the read is running concurrently with w1 and w2? Your comment about “have the effect” only applies to transactions beginning after a write has completed, but the anomaly cockroach suffers from is for a read running concurrently with two writes with no overlapping execution. In that scenario they fail to be strictly serializable, but so would any system meeting only the two criteria you outlined I think.

Edit: Ah ok I see you edited it. Ok that makes sense if it works.

yeah, to clarify also textually the "has the effect of" applies to all witnesses, including those that are concurrent. So if r executes after w1 or w2 then r must see that w1 applies before w2, since w1 "has the effect of" applying before w2, whether or not r is concurrent with them.
Why is it not possible for r to take a dependency on w2 and no dependency on w1?

w2 may hit different nodes than w1 in a sharded database?

Or is the leader-Paxos approach in Accord operate over all nodes in the database regardless of the shards the keys fall into? (I guess that is the only explanation.) But then scalability of these global transactions would be limited, no?

Thank you for your detailed explanations.

So this is where it can start to get a little complicated, but without going into the nitty gritty, taking the example precisely as given in the Jepsen report:

If r arrives at C1 before w1, w1 must witness it and take a dependency on its execution. If it arrives at C2 after w2, then r takes a dependency on w2’s execution. In this case, w1 will not execute until r does, and r will not execute until w2 does, and so it is simply not the case that w1 happened first - w1’s commit will not have been acknowledged to the client yet, and the actual order of execution will end up w2,r,w1, which is entirely correct to all observers since w1 and w2 will now overlap their execution.

> operate over all nodes in the database regardless of the shards the keys fall into

Nope, it’s all entirely independent for each shard. The (implied) dependency graph is what ties shards together, so only when they interact.

Thank you for the explanation.

I get it now. It is all thanks to separating the operation as 1. agreeing on WAL, 2. executing after waiting all WAL agreements stabilizing.

There is a separation between the two. This can introduce waits (which could be theoretically unbounded for EPaxos but Tempo and Accord bounds this).

Maybe in some sense this is in the same spirit of Calvin's serializing WAL first and executing it later. But Accord avoids a single entity doing the serialization and achieves it via a leaderless Paxos approach which uses dependency graphs for WAL serialization, and later using another protocol for WAL execution after settlement.

> Maybe in some sense this is in the same spirit of Calvin's serializing WAL first and executing it later

Yes, it is very similar in effect, with the absence of a global log conferring some important differences with pluses and minuses. I expect to offer interactive transactions eventually, for instance, which may or may not be difficult for Calvin, and faults on a “log shard” are isolated with Accord, but have global impact with Calvin. But snapshot queries are much easier with Calvin.

How do you break dependency cycles across shards (i.e. deadlock detection)?
There aren’t any dependency cycles, as dependencies must take a strictly lower logical timestamp. Dependency resolution is essentially two step: first every transaction that MIGHT take a lower timestamp is adopted as a dependency, then once those transactions have fixed their timestamp those that take a higher timestamp are simply removed from the dependency set.

Once we offer interactive transactions we’ll have to detect cycles, but that should be relatively straightforward since execution dependencies are managed explicitly.