by hardwaresofton 2801 days ago

Sorry, I think either my fundamental understanding is off or I wasn't clear enough in how I was imagining this happening...

If you and a majority of nodes agree on a value like a CRDT OpSet (more simplistically, just agreeing on the state of the log), how does that not guarantee agreement and serializability? It is impossible from that point on to have another majority of nodes have some other view of what happened. Consensus algorithms are

One-copy serializability is exactly what would be achieved by having a read and write quorum[0][1]. It intuitively makes sense to me (and maybe my intuition is wrong), but if you talk to a majority of nodes and ask "this is what we all have, right?" before every write and every read, you've got a guaranteed consistent state (of course, progress isn't guaranteed if partitions happen, nodes die, etc.).

AFAIK Quorums are the state of the art (and arguably the only relatively efficient option) as far as achieving serializability in a distributed system...

[0]: https://en.wikipedia.org/wiki/Quorum_(distributed_computing)...

[1]: https://arxiv.org/pdf/1406.7423.pdf

evanweaver 2800 days ago

> If you talk to a majority of nodes and ask "this is what we all have, right?"

You cannot know this, because the transaction replication is racy and not atomic--it may have applied to only one node while you are doing your read. Whether you see it or don't is luck. So you can have the following scenario (and in practice you will):

- TX commit begins

- TX replicated to node A

- Read from coordinator A' begins

- Read sees replica A (has tx) and replica B (does not)

- Read assumes A wins because of some kind of vector clock in the data value (choosing the older value doesn't make things better, just in case you are wondering)

- Read from coordinator B' begins

- Read sees B (no TX) and C (no TX)

- Read completes with stale value--serializability violation has occurred

- TX finishes replicating to B and C

This leaves aside write skew, torn transactions due to partitions, and all kinds of other problems.
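The interleaving in the list above can be sketched as a tiny simulation. This is only an illustrative model, not anyone's actual system: the `replicas` dict, the `(version, value)` tuples, and the `quorum_read` helper are all hypothetical names, and "A wins" is modeled as picking the highest version among the replicas contacted.

```python
# Hypothetical model: three replicas, majority reads, and a write that
# reaches replicas one at a time (non-atomic replication).
replicas = {"A": (0, "old"), "B": (0, "old"), "C": (0, "old")}  # name -> (version, value)

def quorum_read(names):
    """Return the highest-versioned (version, value) among the replicas contacted."""
    return max(replicas[n] for n in names)

# TX commit begins and is replicated to node A only.
replicas["A"] = (1, "new")

first = quorum_read(["A", "B"])   # coordinator A' sees the TX on A, so A wins
second = quorum_read(["B", "C"])  # coordinator B' starts later, sees no TX

# TX finishes replicating to B and C.
replicas["B"] = (1, "new")
replicas["C"] = (1, "new")
third = quorum_read(["B", "C"])

print(first, second, third)
```

In this model the second read starts after the first one has already returned the new value, yet it returns the old one. That ordering is the stale read described in the list.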

hardwaresofton 2800 days ago

I'm super confused -- what you're describing isn't a quorum read/write scenario -- what do you mean by "a wins"? Also where is the prepare phase for 2-phase commit/any consensus algo? Replica A shouldn't be reporting an unacknowledged transaction to coordinators -- the write hasn't been quorum acknowledged. TX is not considered committed until it reaches a majority of nodes. You are right that if you have a network partition you're in trouble, but that ends in lack of progress, not loss of serializability.

We must be talking about different things because I can't find any literature that has reached the conclusion that serializability is impossible in distributed transactions? Can you point me to that?

Also, do you have any thoughts on the literature[0] that contradicts what you're saying? I'm not an expert, but unless I'm misreading the English, serializability is possible with quorum reads and writes.

> In a new algorithm for maintaining replicated data, every copy of a replicated file is assigned some number of votes. Every transaction collects a read quorum of r votes to read a file, and a write quorum of w votes to write a file, such that r+w is greater than the total number of votes assigned to the file. This ensures that there is a non-null intersection between every read quorum and every write quorum. Version numbers make it possible to determine which copies are current. The reliability and performance characteristics of a replicated file can be controlled by appropriately choosing r, w, and the file's voting configuration. The algorithm guarantees serial consistency, admits temporary copies in a natural way by the introduction of copies with no votes, and has been implemented in the context of an application system called Violet.
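The intersection property the abstract states can be checked by brute force on a small configuration. This is a minimal sketch and not the paper's implementation: the `votes` mapping and the helper names `quorums` and `always_intersect` are made up for illustration, and it only checks overlap, not the locking or version handling the full algorithm needs.

```python
from itertools import combinations

def quorums(votes, threshold):
    """All subsets of copies whose combined votes reach the threshold."""
    names = list(votes)
    return [
        set(subset)
        for size in range(1, len(names) + 1)
        for subset in combinations(names, size)
        if sum(votes[n] for n in subset) >= threshold
    ]

def always_intersect(votes, r, w):
    """True if every read quorum shares at least one copy with every write quorum."""
    return all(rq & wq for rq in quorums(votes, r) for wq in quorums(votes, w))

votes = {"A": 1, "B": 1, "C": 1}  # hypothetical configuration: 3 votes total

ok = always_intersect(votes, r=2, w=2)       # r + w = 4 > 3
broken = always_intersect(votes, r=1, w=2)   # r + w = 3, not greater than 3
print(ok, broken)
```

With r=1 and w=2 a read of {A} misses a write that went to {B, C}, which is why the abstract requires r+w to exceed the total number of votes.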

Has this paper been refuted? In addition to this there's literally the whole section on distributed serializability[1].

[0]: https://dl.acm.org/citation.cfm?doid=800215.806583

[1]: https://en.wikipedia.org/wiki/Serializability#Distributed_se...

evanweaver 2793 days ago

> what you're describing isn't a quorum read/write scenario -- what do you mean by "a wins"? Also where is the prepare phase for 2-phase commit/any consensus algo?

There is no consensus; that requires a leader system. The paper you link appears to require a multi-phase lock; the quorum itself does not guarantee serializability. Explicit preparation via quorum can guarantee serializability (but not strict serializability, I don't think), but cleanup of locks is a big performance problem in practice.
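A rough sketch of what explicit preparation via quorum could look like, under heavy assumptions: a single key, a prepare step that locks a write quorum, and reads that refuse to answer while any contacted replica is locked. The `Replica` class, `Locked` exception, and `try_read` helper are hypothetical, and the sketch ignores coordinator failure, which is where the lock-cleanup cost comes from.

```python
class Locked(Exception):
    """Raised when a read touches a replica holding a prepared write."""

class Replica:
    def __init__(self):
        self.version, self.value = 0, "old"
        self.locked, self.pending = False, None

    def prepare(self, version, value):
        # Phase 1: stage the write and lock the key on this replica.
        self.locked, self.pending = True, (version, value)

    def commit(self):
        # Phase 2: apply the staged write and release the lock.
        self.version, self.value = self.pending
        self.locked, self.pending = False, None

def try_read(quorum):
    """Return the newest (version, value), or "retry" if any replica is locked."""
    try:
        if any(r.locked for r in quorum):
            raise Locked()
        return max((r.version, r.value) for r in quorum)
    except Locked:
        return "retry"

a, b, c = Replica(), Replica(), Replica()

# Prepare on a write quorum {A, B}; every majority read quorum overlaps it.
for replica in (a, b):
    replica.prepare(1, "new")
during = [try_read(q) for q in ([a, b], [b, c], [a, c])]

a.commit()                                  # commit reaches A first
mid = [try_read([a, c]), try_read([b, c])]  # B is still locked
b.commit()
after = try_read([b, c])
print(during, mid, after)
```

In this model no read returns the old value once another read has returned the new one; the price is that reads stall while locks are held, and a crashed coordinator would leave them held.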

> Replica A shouldn't be reporting an unacknowledged transaction to coordinators -- the write hasn't been quorum acknowledged.

Acknowledged by who? The replicas can't block on the other replicas; they just tell the coordinator when they applied the write.

This is worth a blog post.
