by fizx 4442 days ago
I don't know that that's necessarily a good idea.

As you (perhaps automatically) expand and collapse the cluster, you'll need to make sure to communicate to all nodes what the new cluster size is. If some nodes don't know the correct quorum count, split-brain!
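To make the failure mode concrete, here is a minimal sketch of how a stale cluster size can let both sides of a partition believe they hold a majority. The function names and the 5-node/3-node scenario are assumptions for illustration, not how etcd or any particular system computes quorum.

```python
def quorum(cluster_size: int) -> int:
    """Smallest strict majority for a cluster of the given size."""
    return cluster_size // 2 + 1


def can_commit(votes: int, believed_cluster_size: int) -> bool:
    """A node commits once it sees a majority of what it *believes* the cluster size to be."""
    return votes >= quorum(believed_cluster_size)


# Hypothetical scenario: the cluster grew from 3 to 5 nodes,
# but two of the nodes never learned about the new size.
actual_size = 5
stale_size = 3

# A network partition separates the 2 stale nodes from the 3 up-to-date ones.
partition_a_commits = can_commit(votes=2, believed_cluster_size=stale_size)
partition_b_commits = can_commit(votes=3, believed_cluster_size=actual_size)

# Both sides think they have quorum, so both accept writes: split-brain.
split_brain = partition_a_commits and partition_b_commits
print(split_brain)
```

If every node agreed on the size of 5, the 2-node side would need 3 votes and could not commit on its own.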

Also, coordination services are typically critical, so it's important to isolate them from the bugs in the ad hoc code you're writing for your web tier, a crazy query in your database, etc.

It's much easier and safer in practice to just have 3 or 5 nodes running the coordination in isolation.

Edit: more reasons -- It's easier to deploy a coordination service to 5 nodes than 500. It's easier to debug 5 nodes than 500.

1 comments

I probably should have said that it "can" run on any node. Yes, currently it does run on every node, but their roadmap doesn't have the requirement that every node be actively participating in elections.

I'm sure that you have seen fleets of dedicated ZooKeeper nodes. I rather like that etcd is simply a service that can run on any node, and does not require a separate role-specific fleet of servers just to do coordination. That was the point I was attempting to make.