| Any chance you could be more specific as to what you feel is missing in the book? Granted, "distributed systems" is an enormous topic that no book can cover fully, but I have tried to cover things like: - key papers (Lamport; Fischer, Lynch and Paterson; Chandra and Toueg etc.) - topics relevant to highly successful commercial systems (e.g. 2PC => *SQL systems, Paxos => GFS/Chubby, ZAB => ZooKeeper, Dynamo => Riak/Voldemort/Cassandra) - and recent topics such as CRDTs and the CALM theorem. Having a sense of how time, consistency and fault tolerance have been explained and handled is (I think) a prerequisite to more advanced topics, but I'd be interested in hearing which parts you feel need improvement, because some day (probably some years from now) I will revise the book, and it would be nice to have a solid list of issues to address. |
Your book focuses on a pretty narrow part of distributed computing. I would rename it "Managing State in Distributed Systems", or "Distributed Storage Systems". Your examples are Bigtable and Dynamo, which fall in this category.
The book seems to be aimed at sort of a "beginning" audience. But the topics are inappropriate for a beginning audience, and skewed for an expert audience.
Real distributed systems try to be stateless wherever possible. You need "big computer science" to manage state in distributed systems, but most code in a distributed system should not manage state. These techniques should be confined to specialized storage systems.
Here are some examples of real-world distributed systems that don't use the described techniques to manage state:
The title seems to imply a practical bent, but the book reads more like a collection of ideas, which are important and interesting but not really what engineers need to know. (IMO the #1 skill for distributed computing is to be competent at BOTH programming a single computer and system administration.) If I wanted to be harsh, I would say it looks like you read a bunch of stuff and didn't work with it or implement it? At the very least, the ideas don't seem to be put in the context of commonly deployed distributed systems.
People need to understand these simpler, more robust, and more performant techniques, and how to apply them to their specific problem domain, rather than blindly throwing consensus at every problem (which is a disturbing trend I've seen).