by matthewaveryusa 3811 days ago
Great to see fresh ideas! One thing I don't like about the presentation is that TAPIR is presented as doing better than what is out there without stating the conditions. First off, there's quite a bit of hand-waving when it comes to leadership bottlenecks; please assume that sane sharding is occurring. I'm not entirely sure that transactions spanning partitions are something unique to, say, Paxos. The abort rate vs. contention numbers seem off. Looking at the paper, all the nodes are in a single datacenter. I would love to see these numbers with TAPIR spread across larger geographical regions. My suspicion is that higher latency will hurt the abort rate more than it would with a strong leader. What about poor clock synchronization conditions? What about testing with a variety of client latencies? Since the client is effectively acting as a leader in TAPIR, the client is, in some ways, contending with other clients, and the abort rate may be correlated with client latency. I don't think high-latency clients would see the same correlation with a strong leader. I wish more of the compromises were presented.
2 comments

The leader bottleneck will continue to exist even if there is great sharding, because the leaders simply process more messages than the replicas, so TAPIR will allow each shard to support more throughput.
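To put rough numbers on the leader bottleneck, here is a back-of-the-envelope sketch. It assumes a textbook leader-based round (client request in, one accept to each follower, one ack back from each, client reply out) and, for comparison, a leaderless round in which the client talks to every replica directly. Both models and all function names are illustrative assumptions, not measurements or message counts from the paper.

```python
def leader_based_load(n_replicas: int) -> dict:
    """Messages handled per committed operation, per role (assumed textbook model)."""
    if n_replicas < 2:
        raise ValueError("need a leader and at least one follower")
    followers = n_replicas - 1
    # Leader: 1 client request in + accepts out + acks in + 1 client reply out.
    leader_msgs = 1 + followers + followers + 1
    # Each follower: 1 accept in + 1 ack out.
    follower_msgs = 2
    return {"leader": leader_msgs, "follower": follower_msgs}


def leaderless_load(n_replicas: int) -> dict:
    """Assumed leaderless round: every replica gets 1 request and sends 1 reply."""
    if n_replicas < 1:
        raise ValueError("need at least one replica")
    return {"replica": 2}


def shard_throughput(node_msgs_per_sec: float, busiest_msgs_per_op: int) -> float:
    """Operations per second a shard sustains if its busiest node is the limit."""
    return node_msgs_per_sec / busiest_msgs_per_op


if __name__ == "__main__":
    n = 3
    capacity = 60_000  # hypothetical per-node message budget
    print(shard_throughput(capacity, leader_based_load(n)["leader"]))
    print(shard_throughput(capacity, leaderless_load(n)["replica"]))
```

Under these assumptions the leader handles 2n messages per operation while every other node handles 2, so the gap grows with the replication factor no matter how well the keys are sharded.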

The paper has an evaluation for multi-data-center replication in Figure 12. We assume that the clients are web servers, so they are always close to one of the replicas, but not all of them. Basically, we found that TAPIR performs better in the multi-data-center case except when the leader is in the same data center as the client. So it depends on whether you can always guarantee that the leader is in the same data center as the client.

The abort rate continues to essentially track the latency needed for commit. So, TAPIR reduces the abort rate compared to OCC because it reduces the commit latency. At very high contention, locking is likely to make slightly more progress, but no systems with strong consistency will be able to provide high performance. If you are interested in some other ways to optimize for the high contention case, take a look at our work on Claret: http://homes.cs.washington.edu/~bholt/projects/claret.html
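One simple way to see why the abort rate would track commit latency is a toy model, assuming conflicting writes arrive as a Poisson process and a transaction aborts if any of them lands inside its commit window. This model, the function name, and the example numbers are assumptions for illustration, not the analysis used in the paper.

```python
import math


def abort_probability(conflict_rate_per_s: float, commit_window_s: float) -> float:
    """P(at least one conflicting write arrives during the commit window).

    Assumes Poisson arrivals of conflicting writes at `conflict_rate_per_s`.
    """
    if conflict_rate_per_s < 0 or commit_window_s < 0:
        raise ValueError("rate and window must be non-negative")
    return 1.0 - math.exp(-conflict_rate_per_s * commit_window_s)


if __name__ == "__main__":
    rate = 5.0    # hypothetical conflicting writes per second on the same keys
    rtt = 0.050   # hypothetical round-trip time in seconds
    # Compare a commit that takes two round trips with one that takes one.
    print(abort_probability(rate, 2 * rtt))
    print(abort_probability(rate, 1 * rtt))
```

In this model, shortening the commit window always lowers the abort probability, and the benefit shrinks as contention gets very high because the probability saturates near 1 either way.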

We also tested with high clock skew. The paper notes, "with a clock skew of 50 ms, we saw less than 1% TAPIR retries." Since the clients can use the retry timestamps to sync their clocks, it only adds an extra round-trip, so it still leaves TAPIR with the same latency as a conventional system, even in cases of extremely high clock skew.
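As a rough illustration of the clock-sync idea, here is a minimal sketch of a client-side clock that nudges itself forward whenever a retry timestamp shows it is running behind. The class and method names are hypothetical and the adjustment rule is an assumption; this is not code from the TAPIR implementation.

```python
import time


class SkewCorrectedClock:
    """Local clock plus an offset learned from retry timestamps (hypothetical helper)."""

    def __init__(self, now_fn=time.time):
        self._now = now_fn   # injectable time source, handy for testing
        self.offset = 0.0    # correction added to the local clock

    def timestamp(self) -> float:
        """Timestamp the client would propose for its next transaction."""
        return self._now() + self.offset

    def observe_retry(self, retry_timestamp: float) -> None:
        """If the retry timestamp is ahead of us, move the offset forward to match."""
        lag = retry_timestamp - self.timestamp()
        if lag > 0:
            self.offset += lag


if __name__ == "__main__":
    clock = SkewCorrectedClock()
    first_try = clock.timestamp()
    # Pretend the replicas asked for a retry 50 ms in the future.
    clock.observe_retry(first_try + 0.050)
    print(clock.offset)
```

With something like this, only the first transaction after a large skew pays the extra round trip; later proposals already include the learned offset.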

Yes, assumptions and pre-conditions would be quite useful. So much academic work has failed in the real world because of a mismatch between what its authors expected and what actually occurred. Or even between what its users expected and what it was built for.