| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by abadid 2743 days ago

(1) I'm not sure I fully understand what you are referring to wrt the "acks" but either way, one major difference is that the acks don't have to be made durable in the alg. described by the post. Also, the other key thing to note is that alg. described by the post does not suffer from the blocking or cloggage problems of 2PC.

(2) The post was already too long, so I didn't cover all the cases. The code rewrite algorithm described in the post is indeed O(n^2) if every shard has the possibility of a state-based abort. However, the number of shards that have possibilities of state-based aborts is known before the transaction begins, and since the same values are being sent everywhere, you can always use standard network broadcast techniques to bring the complexity back to O(n).

(3) Garbage collection can work via a high water mark. You keep track of the highest transaction number for which all transactions less than this number have been completed, and can garbage collect values for those versions.

2 comments

magicalhippo 2743 days ago

> However, the number of shards that have possibilities of state-based aborts is known before the transaction begins, and since the same values are being sent everywhere, you can always use standard network broadcast techniques to bring the complexity back to O(n)

Ah, good point. That assumes the workers are within broadcast range, but you'll want that anyway for speed. This means a worker will have to push values to other workers, so they'll maintain a queue of these then?

Anyway, thanks for the responses. Like I said, not my field at all, but really fun stuff to ponder.

link

magicalhippo 2743 days ago

The point with the acks is, as far as I can see, that the client initiating the transaction has to 1) know when each worker has received the transaction request, and 2) when the first worker has completed it and what the result of the transaction is (failed due to constraints or success).

Point 1 is in case a worker dies before it has durably stored the transaction request.

Point 2 is so that the client can move on.

link