by mrkeen
221 days ago
* As others have pointed out, Paxos is leaderless. Electing a leader is a performance trick (reduce contention/retries), not a correctness trick - if you want to order your events.

* EPaxos appears to relax ordering as long as the clients can declare their event-dependencies.

Q1) If I withdraw from ATM 1 and someone else withdraws from ATM 2, we are independent consumers - so how do we possibly coordinate which withdrawal depends on the other?

Q2) Assuming that's not a problem, how do I get the ability to replay events? If the nodes don't care about order (beyond constraints), how can I re-read events 1-100, suffer a node outage, and resume reading events 101-200 from a replacement node?
    def do_commands_conflict(c1, c2):
        # read(c) / write(c): the sets of keys that command c reads / writes
        return (len(write(c1) & read(c2)) > 0
                or len(write(c2) & read(c1)) > 0
                or len(write(c1) & write(c2)) > 0)
Whenever an EPaxos node learns about a new command, it compares it to the commands that it already knows about. If the new command conflicts with any of them, then it gains a dependency on them (see Figure 3, "received PreAccept"). So the commands race; the first node to learn about both of them is going to determine the dependency order [in some cases, two nodes will disagree on the order in which the conflicting commands were received -- this is what the "Slow Path" is for].
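To make the race concrete, here is a rough sketch of that dependency-collection step, applied to the two-ATM case from Q1. It is not the real protocol: it leaves out instance numbers, sequence numbers, quorums and the Slow Path itself, and the names `Command`, `Node` and `pre_accept` are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Command:
    cmd_id: str
    reads: frozenset
    writes: frozenset


def conflicts(a, b):
    # Two commands conflict if either one writes a key the other reads or writes.
    return bool(a.writes & b.reads or b.writes & a.reads or a.writes & b.writes)


class Node:
    """Hypothetical replica that only tracks commands and their dependencies."""

    def __init__(self):
        self.known = {}  # cmd_id -> Command
        self.deps = {}   # cmd_id -> set of cmd_ids it depends on

    def pre_accept(self, cmd):
        # The new command depends on every already-known command it conflicts with.
        deps = {c.cmd_id for c in self.known.values() if conflicts(c, cmd)}
        self.known[cmd.cmd_id] = cmd
        self.deps[cmd.cmd_id] = deps
        return deps


acct = frozenset({"acct:42"})
w1 = Command("w1", reads=acct, writes=acct)  # withdrawal at ATM 1
w2 = Command("w2", reads=acct, writes=acct)  # withdrawal at ATM 2, same account
d = Command("d", reads=frozenset({"acct:7"}), writes=frozenset({"acct:7"}))

node_a, node_b = Node(), Node()
for cmd in (w1, w2, d):   # node A hears about w1 first
    node_a.pre_accept(cmd)
for cmd in (w2, w1, d):   # node B hears about w2 first
    node_b.pre_accept(cmd)
```

Here node A records that `w2` depends on `w1`, while node B records the opposite; that disagreement is the situation the Slow Path has to resolve. The unrelated command `d` picks up no dependencies on either node.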
The clients don't coordinate this; the EPaxos nodes choose the order. The cluster as a whole guarantees linearizability. Roughly, this means that there's at least one possible ordering of client requests that would produce the observed behavior (and that respects the order of requests that didn't overlap in time); if two clients send requests concurrently, there's no guarantee of who goes first.
(in particular, the committed dependency graph is durable, even though it's arbitrary, so in the event of a failure/restart, all of the nodes will agree on the dependency graph, which means that they'll always apply non-commuting commands in the same order)
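As a sketch of why agreeing on the graph is enough for Q2: any node can derive the same apply order from the same committed graph. The version below assumes each committed command carries a dependency set and a sequence number, finds strongly connected components with Tarjan's algorithm, and breaks cycles by sequence number. The function name and data layout are hypothetical, and this version imposes a total order even on commuting commands, which is stricter than needed.

```python
def execution_order(deps, seq):
    """deps: cmd_id -> set of cmd_ids it depends on (every id must be a key).
    seq: cmd_id -> sequence number, used to order commands inside a cycle."""
    index, low = {}, {}
    stack, on_stack = [], set()
    order = []
    counter = 0

    def visit(v):
        nonlocal counter
        index[v] = low[v] = counter
        counter += 1
        stack.append(v)
        on_stack.add(v)
        for w in sorted(deps[v]):  # sorted so the traversal is deterministic
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:
            # v is the root of a strongly connected component; pop it off.
            component = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                component.append(w)
                if w == v:
                    break
            # Components are emitted dependencies-first; inside a cycle,
            # fall back to (sequence number, id).
            order.extend(sorted(component, key=lambda c: (seq[c], c)))

    for v in sorted(deps):
        if v not in index:
            visit(v)
    return order


committed = {"a": set(), "b": {"a"}, "c": {"b", "d"}, "d": {"c"}}
seqs = {"a": 1, "b": 2, "c": 4, "d": 3}
replay = execution_order(committed, seqs)
```

With this input, `c` and `d` depend on each other, so they form one component and are ordered by sequence number; `a` and `b` come first because everything else depends on them. A replacement node holding the same committed graph would compute the same list, so a reader could resume from a position in it.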