As far as I can tell, here's my summary of the architecture I just read.
For service-to-service communication, they've deployed client-side load balancing and discovery using per-client HAProxy instances. Rather than have every client poll every possible service for health, ZooKeeper is used as an endpoint status repository. The HAProxy configs are kept up to date by a tool called "Synapse" that queries ZK. ZK in turn is kept up to date by their own health check service, "Nerve".
I've developed something similar myself, using LVS, for a private Australian CDN. Nice to see the model generalised, robustified and open-sourced. There are possible issues relating to work levelling and spike management, but if it's working for Airbnb, great.
If you needed to describe this in an enterprise context, I'd tell them it's a distributed SOA broker. That's a gross oversimplification but the buzzword bingo'll satisfy most project managers.
For a middle-aged IT manager I'd say "it's like the Oracle Parallel Server client reliability model, only for web services rather than databases". Again an oversimplification, but they'd get the idea.
With a service-oriented architecture (SOA) you basically break your application into distinct services that run across different machines. But you also need a way to find and connect to those services (service discovery). This provides a way of doing that using ZooKeeper and local HAProxy processes.
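To make the "ZooKeeper plus local HAProxy" part concrete, here is a rough Python sketch of the Synapse side: take a snapshot of healthy endpoints and render one HAProxy `listen` section per service. This is not Synapse's actual code. The `Registry` shape, the `render_haproxy` function, the local port numbers and the IP addresses are all made up for illustration, and a plain dict stands in for the ZooKeeper lookup.

```python
from typing import Dict, List, Tuple

# Hypothetical snapshot of a registry lookup (a stand-in for ZooKeeper):
# service name -> list of (host, port) endpoints currently marked healthy.
Registry = Dict[str, List[Tuple[str, int]]]


def render_haproxy(registry: Registry, local_ports: Dict[str, int]) -> str:
    """Render one HAProxy `listen` section per service the app depends on.

    `local_ports` maps each service name to the localhost port the app
    connects to; both names are assumptions, not SmartStack settings.
    """
    lines: List[str] = []
    for service in sorted(local_ports):
        lines.append(f"listen {service}")
        lines.append(f"    bind 127.0.0.1:{local_ports[service]}")
        # Sort endpoints so the output is stable between refreshes.
        for i, (host, port) in enumerate(sorted(registry.get(service, []))):
            lines.append(f"    server {service}_{i} {host}:{port} check")
        lines.append("")
    return "\n".join(lines)


if __name__ == "__main__":
    registry = {"search": [("10.0.0.12", 8080), ("10.0.0.11", 8080)]}
    print(render_haproxy(registry, {"search": 3212}))
```

The app then only ever talks to `127.0.0.1:3212`; which backends sit behind that port changes as the registry changes. A real tool would also have to reload HAProxy after rewriting the config, which this sketch leaves out.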
So I register a new service (or process) to run on metal using Chef. Then Chef installs the service and updates the Nerve config running on that metal device. Nerve then keeps a central ZooKeeper updated with the status of its local services? Synapse checks for available services in ZooKeeper that your app may need to use, and then configures a local load balancer for any requests you make?
That's the idea! If you're using Chef, check out the cookbook for SmartStack; you keep a small hash of configuration information per service, and we take care of the rest. https://github.com/airbnb/smartstack-cookbook
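The Nerve half of that flow can be sketched in a few lines of Python: run a health check, and make the registry agree with the result. Again, this is only an illustration under assumptions, not how Nerve is implemented. `InMemoryRegistry` is a hypothetical stand-in for ZooKeeper, and `report_once` and the endpoint values are invented names.

```python
from typing import Callable, Dict, Set, Tuple

Endpoint = Tuple[str, int]


class InMemoryRegistry:
    """Hypothetical stand-in for ZooKeeper: service name -> live endpoints."""

    def __init__(self) -> None:
        self.services: Dict[str, Set[Endpoint]] = {}

    def register(self, service: str, endpoint: Endpoint) -> None:
        self.services.setdefault(service, set()).add(endpoint)

    def deregister(self, service: str, endpoint: Endpoint) -> None:
        # discard() is a no-op if the endpoint was never registered.
        self.services.get(service, set()).discard(endpoint)


def report_once(
    registry: InMemoryRegistry,
    service: str,
    endpoint: Endpoint,
    is_healthy: Callable[[], bool],
) -> bool:
    """Run one health check and make the registry match the result."""
    healthy = is_healthy()
    if healthy:
        registry.register(service, endpoint)
    else:
        registry.deregister(service, endpoint)
    return healthy


if __name__ == "__main__":
    reg = InMemoryRegistry()
    endpoint = ("10.0.0.11", 8080)
    report_once(reg, "search", endpoint, lambda: True)
    print(reg.services)
    report_once(reg, "search", endpoint, lambda: False)
    print(reg.services)
```

In practice you'd call something like `report_once` on a timer for each local service, with `is_healthy` doing an HTTP or TCP probe.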