Hacker News new | ask | show | jobs
by mattiemass 3784 days ago
Interesting stuff. In all my distributed systems work so far, I've assumed that a distributed lock is a thing to avoid. I really should take another look at them, just as a tool to have at my disposal.
4 comments

If you assume that your distributed lock gives you transactional guarantees that you are the only lock holder then you are making a mistake. If, however, you can tolerate small overlaps in lock holders you are fine and this helps with numerous distributed algorithms. Further, using other facilities such as fences can make it even more secure. Another feature of ZooKeeper is write-with-version. You could obtain a lock (using Apache Curator - note I wrote this), then do a write-with-version to achieve 100% certainty that you are the only writer.
BTW - I wrote a Curator Tech Note about this a while ago: https://cwiki.apache.org/confluence/display/CURATOR/TN10
Like anything, it depends on what you're using it for. I wouldn't put a distributed lock into some massively high volume request path, or where absolute availability is required, but it's perfectly fine for some scenarios.

As Martin points out though, many (most?) distributed lock implementations are or can be broken in various ways. He hints at one of the fundamental problems - consensus. Many distributed lock implementations fail simply because they cannot achieve reliable consensus, or don't even try to.

That said, distributed locks can be safe and handle reasonably high throughput. The Atomix distributed lock is one example:

http://atomix.io/atomix/

http://atomix.io/atomix/docs/coordination/#distributedlock

Since consensus requires quorum and quorum requires availability, there is a risk that your ability to obtain a lock or learn about lock related events could be effected by availability, but at least in the case of Atomix, the system is fairly resilient with auto-failovers to passive or inactive nodes as needed (as compared to, say, a ZooKeeper based lock).

Thanks for posting about Atomix; I had not heard about it before. This seems to be based on Copycat, discussed previously on HN here: https://news.ycombinator.com/item?id=8180360
It is indeed built on Copycat. We're updating the docs right now ahead of an RC, this week or next.
Oracle has a distributed transaction system that'll work cross-country, but I can't find the details. It's basically a complete rip-off of the OpenVMS implementation from the late 70s (or whenever all those VAX/VMSstations came out - post PDP), no surprises there, and it costs an insane amount just for that functionality (separate from RAC), but it works really well. I'm trying to find the docs since it's been more than half a decade since I used it but [1] is the closest I can get (and that's certainly not it, though RAC does it's job quite well also and that page is worth a read if only for exposing yourself to time-tested transactional models.) It might be HA-NFS over ACFS but that doesn't seem quite right to me. Anyways, read section 5 of this[2] Oracle doc, to have a master-class on maintaining database integrity across the datacenter or across the country. I've used a litany of DBs and I think I wouldn't pick anything made in the last 10 years for my primary store other than Datomic (which I've beaten into the ground as hard as I could and it took my abuse) and or maybeeee VoltDB (if only because Stonebraker is DJB-level smart).

[1] http://docs.oracle.com/cd/B28359_01/server.111/b28318/consis...

I mean, it's definitely a thing you want to avoid whenever possible, because it's a strict point of slowdown in a distributed application.
Perf for sure, but I was typically more concerned about availability.