by narsil 4031 days ago
We solve the bootstrapping problem with an internal ELB instead.

Auto Scaling groups can be configured to register their instances with multiple ELBs. We use two: a regular ELB for reaching the instances, and an internal ELB that only accepts connections on the etcd port from other instances in the cluster (enforced via security groups).

When an instance comes up, it adds itself to the cluster via the internal ELB's hostname. The hostname is set in Route 53.
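
For anyone wanting to try this, here is a rough sketch of the join step in Python, not our actual tooling. It assumes the etcd 2.x members API (`POST /v2/members`), a made-up Route 53 name `etcd-internal.example.com` for the internal ELB, and the default ports 2379 (client) and 2380 (peer). The function names are hypothetical.

```python
import json
import urllib.request

# Assumptions, not taken from a real setup:
ELB_HOST = "etcd-internal.example.com"  # hypothetical Route 53 name of the internal ELB
CLIENT_PORT = 2379                      # assumed etcd client port


def build_add_member_request(elb_host, peer_url, port=CLIENT_PORT):
    """Build the POST that registers a new member through the internal ELB."""
    body = json.dumps({"peerURLs": [peer_url]}).encode("utf-8")
    return urllib.request.Request(
        f"http://{elb_host}:{port}/v2/members",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def initial_cluster(members, new_name, new_peer_url):
    """Format the name=peerURL list for the joining node.

    `members` is the "members" array from GET /v2/members. Entries with an
    empty name (registered but not yet started) are skipped.
    """
    pairs = [
        f"{m['name']}={url}"
        for m in members
        if m.get("name")
        for url in m["peerURLs"]
    ]
    pairs.append(f"{new_name}={new_peer_url}")
    return ",".join(pairs)


if __name__ == "__main__":
    # Only the request is built here; sending it would be
    # urllib.request.urlopen(req) from inside the cluster's security group.
    req = build_add_member_request(ELB_HOST, "http://10.0.0.5:2380")
    print(req.get_method(), req.full_url)
```

The new instance would register itself first, then start etcd with `ETCD_INITIAL_CLUSTER` set to the string from `initial_cluster` and `ETCD_INITIAL_CLUSTER_STATE=existing`.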

The biggest issues we've been having with etcd continue to be simultaneous reboots and/or simultaneous joins to the cluster. It would also be great if the membership timeout feature that used to exist in 0.4 made its way back in. Right now, each member has to be explicitly removed rather than eventually timing out if it hasn't rejoined.
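
To illustrate what the explicit removal involves, here is a minimal sketch, again assuming the etcd 2.x members API (`DELETE /v2/members/<id>`) and the same made-up ELB hostname. How you get the list of live peer URLs (for example from the Auto Scaling group) is left out, and the helper names are hypothetical.

```python
import urllib.request


def stale_member_ids(members, live_peer_urls):
    """Return IDs of members whose peer URLs match no live instance.

    `members` is the "members" array from GET /v2/members;
    `live_peer_urls` is whatever your instance inventory reports as running.
    """
    live = set(live_peer_urls)
    return [m["id"] for m in members if not live.intersection(m["peerURLs"])]


def build_remove_member_request(elb_host, member_id, port=2379):
    """Build the DELETE that removes one member (port 2379 is an assumption)."""
    return urllib.request.Request(
        f"http://{elb_host}:{port}/v2/members/{member_id}",
        method="DELETE",
    )


if __name__ == "__main__":
    members = [
        {"id": "a1", "peerURLs": ["http://10.0.0.1:2380"]},
        {"id": "b2", "peerURLs": ["http://10.0.0.2:2380"]},
    ]
    for member_id in stale_member_ids(members, ["http://10.0.0.1:2380"]):
        req = build_remove_member_request("etcd-internal.example.com", member_id)
        print(req.get_method(), req.full_url)
```

Removing members one at a time matters here, since dropping too many at once can cost the cluster its quorum.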

Looking forward to hearing any other approaches folks have taken.