
by mjb 4367 days ago

The Sinfonia paper (http://www.sosp2007.org/papers/sosp064-aguilera.pdf) is interesting, and section 4 does a better job of explaining how this 'minitransaction' model works than this blog post does.

The part to pay the most attention to is the way it handles node failure:

> The traditional way to avoid blocking on coordinator crashes is to use three-phase commit, but we want to avoid the extra phase. We accomplish this by blocking on participant crashes instead of coordinator crashes.

That seems sensible, and is later justified by the fact that the memory nodes can be replicated, and can be highly available in their own right. This brings a significant limitation:

> When a participant memory node crashes, the system blocks any outstanding minitransactions involving the participant until the participant recovers.

So, in Sinfonia, the memory nodes HAVE to be highly available, or you've built a system with many single points of failure instead of just one.
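To make that blocking behavior concrete, here is a minimal sketch in Python. Everything in it (`MemoryNode`, `run_minitransaction`, the dict-backed storage, the `"blocked"` return value) is hypothetical and is not Sinfonia's actual API. It only illustrates the shape of the idea: compare items are checked at each participant, writes are applied only if every participant votes yes, and a crashed participant leaves the outcome undecided. Locking, logging, and recovery are left out entirely.

```python
class MemoryNode:
    """Hypothetical stand-in for a Sinfonia memory node (not the real API)."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.up = True  # set to False to simulate a crash

    def prepare(self, compares):
        # Vote yes only if every compare item matches the current contents.
        return all(self.data.get(addr) == expected for addr, expected in compares)

    def commit(self, writes):
        for addr, value in writes:
            self.data[addr] = value


def run_minitransaction(compares, writes):
    """compares and writes are lists of (node, addr, value) tuples.

    Returns "committed", "aborted", or "blocked". A real system would wait
    for the crashed participant to recover; this sketch just reports it.
    """
    # Collect each distinct participant once, preserving order.
    participants = list({id(n): n for n, _, _ in compares + writes}.values())

    # A crashed participant cannot vote, so the outcome cannot be decided.
    if any(not node.up for node in participants):
        return "blocked"

    # Phase one: every participant checks its own compare items.
    for node in participants:
        node_compares = [(a, v) for n, a, v in compares if n is node]
        if not node.prepare(node_compares):
            return "aborted"

    # Phase two: all voted yes, so apply the writes.
    for node in participants:
        node.commit([(a, v) for n, a, v in writes if n is node])
    return "committed"
```

With two nodes `a` and `b`, a minitransaction that compares on `a` and writes on `b` commits while both are up, aborts if the compare fails, and reports "blocked" as soon as `b.up` is set to `False`, which is the single-point-of-failure concern above.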

Sinfonia is actually a pretty clever system and, despite some interesting edges (crash recovery and log compaction), is not particularly complicated. Directly comparing its memory model to multi-Paxos, though, rings a little hollow for me. One is an atomic commit protocol that avoids blocking on coordinator crashes and allows transactions across nodes assumed to be reliable; the other is a consensus protocol that replicates a log across unreliable nodes. They aren't really solving the same problem.
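A rough way to see the difference is to compare when each protocol can make progress. The two functions below are hypothetical and deliberately simplified (they ignore replication of memory nodes, leader election, and everything else); they only encode the liveness condition: atomic commit needs every participant, while majority-based consensus needs more than half of the replicas.

```python
def atomic_commit_can_progress(participants_up):
    # Every participant must vote, so a single crash stalls the transaction.
    return all(participants_up)


def consensus_can_progress(replicas_up):
    # A strict majority of replicas is enough to agree on the next log entry.
    return sum(replicas_up) > len(replicas_up) // 2
```

With one of three nodes down, `atomic_commit_can_progress([True, True, False])` is false while `consensus_can_progress([True, True, False])` is true.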

1 comment

topher-the-geek 4367 days ago

> So, in Sinfonia, the memory nodes HAVE to be highly available, or you've built a system with many single points of failure instead of just one.

Yes, in Sinfonia that's true since an item resides on only one node. The Scalaris project augments Sinfonia by performing operations on a majority of replicas.

> the other is a consensus protocol which replicates a log across unreliable nodes

Interesting. I hadn't thought of it like that. Indeed, when you frame it that way, it doesn't seem right to compare a minitransaction to multi-Paxos. I had thought of multi-Paxos as a mechanism to serialize updates on the replicas of a key-value store. In that framing, the comparison makes more sense.
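To illustrate the framing I had in mind, here is a small sketch. It does not implement Paxos at all; it assumes the log order has already been agreed on by some consensus layer, and `apply_log` and `agreed_log` are made-up names. The point is only that replicas applying the same updates in the same order end up with the same key-value state.

```python
def apply_log(log):
    """Apply an already-agreed sequence of (key, value) updates in order."""
    store = {}
    for key, value in log:
        store[key] = value  # later entries overwrite earlier ones
    return store


# Assumed output of the consensus layer: one log, identical on every replica.
agreed_log = [("x", 1), ("y", 2), ("x", 3)]

# Three replicas each replay the same log independently.
replicas = [apply_log(agreed_log) for _ in range(3)]
```

Because every replica sees the same order, the two writes to `x` are serialized the same way everywhere.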
