by bkeroack 3784 days ago
It's not really "fundamental". It's simply that the process that acquires the lock can fail (or be paused, or partitioned away from everything else, etc.), and if it does, but then comes back later still believing its lock is valid, bad things may happen.

The author's solution is to push serialization logic into the resource/storage layer (by checking fencing tokens). But what if the resource is itself distributed? Then it needs its own synchronization mechanism? It's locks all the way down.
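
For anyone who hasn't seen fencing tokens, here is a rough sketch of the idea as I understand it. The names `LockService` and `FencedStore` are made up for illustration and are not from the article or any real library. The assumption is that the lock service hands out a monotonically increasing token with every grant, and that the storage layer refuses any write carrying a token older than one it has already seen.

```python
class LockService:
    """Hypothetical lock service: every grant comes with a larger token."""

    def __init__(self):
        self.next_token = 0

    def acquire(self):
        self.next_token += 1
        return self.next_token


class FencedStore:
    """Hypothetical storage layer that checks the fencing token on each write."""

    def __init__(self):
        self.highest_token = 0
        self.value = None

    def write(self, token, value):
        # Reject a write whose token is older than one already accepted.
        # In a real system this check-and-set has to be atomic.
        if token < self.highest_token:
            return False
        self.highest_token = token
        self.value = value
        return True


locks = LockService()
store = FencedStore()

token_a = locks.acquire()  # client A gets the lock, then stalls (e.g. a long pause)
token_b = locks.acquire()  # A's lock expires and client B is granted the lock

ok_b = store.write(token_b, "from B")  # accepted: newest token seen so far
ok_a = store.write(token_a, "from A")  # A wakes up late; rejected as stale
```

Note that the check inside `write` only works because a single in-memory object serializes it. Once the store is itself distributed, that compare-and-update needs its own coordination, which is exactly the "locks all the way down" problem.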

1 comment

Thinking more about it, this is a fundamental weakness of having self-policing processes, which I suppose is the OP's main point. It can be mitigated by having infinite lock TTLs, at the cost of risking system deadlock on process failure. Thank you to GPP for spurring me to think more deeply about this.
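
To make the TTL trade-off concrete, here is a toy model. `LeaseLock` is a hypothetical class, and time is passed in explicitly rather than read from a clock, so this is only a sketch of the two failure modes and not how any particular lock service behaves.

```python
import math


class LeaseLock:
    """Hypothetical lock whose grant expires after `ttl` seconds."""

    def __init__(self, ttl):
        self.ttl = ttl
        self.holder = None
        self.expires_at = 0.0

    def try_acquire(self, client, now):
        # Grant the lock if it is free or the previous lease has run out.
        if self.holder is None or now >= self.expires_at:
            self.holder = client
            self.expires_at = now + self.ttl
            return True
        return False


# Finite TTL: A acquires, pauses past the TTL, and B is granted the lock.
# A has no way of knowing and may still act as if it held the lock.
finite = LeaseLock(ttl=10.0)
a_got = finite.try_acquire("A", now=0.0)
b_got = finite.try_acquire("B", now=30.0)

# Infinite TTL: the lease never expires, so if A dies without releasing,
# B is locked out no matter how long it waits.
infinite = LeaseLock(ttl=math.inf)
infinite.try_acquire("A", now=0.0)
b_blocked = infinite.try_acquire("B", now=1e9)
```

With the finite TTL both `a_got` and `b_got` are true, which is the stale-holder case. With the infinite TTL `b_blocked` is false forever, which is the deadlock case.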

As I stated, though, if the resource being protected is either a distributed system itself, or a system that cannot support fencing logic, this failure mode is difficult or impossible to prevent. The frequency of failure should be kept in mind here: 5-minute GC pauses are unlikely enough that most services can probably still guarantee 99.99% uptime in spite of them.