"Consul agent takes part in Raft, so doing infrastructure discovery in your whole environment means every system takes part in Raft"
The Consul agent runs in two modes: server and client. From the Consul docs:
"Client - A client is an agent that forwards all RPCs to a server. The client is relatively stateless. The only background activity a client performs is taking part in the LAN gossip pool. This has a minimal resource overhead and consumes only a small amount of network bandwidth."
The client doesn't take part in the consensus protocol. This seems to be the OP's whole reason for choosing etcd over Consul, and I don't believe the OP's claim is correct.
I personally think the OP's reason to use etcd mainly comes down to two things:
1.) The ecosystem it integrates with
2.) Only maintaining 1 SR technology
I think that with the growing Kubernetes trend, many ops people are already somewhat familiar with the CoreOS stack, where etcd fits in very well. The HashiCorp toolchain is great, but I've seen many ops people use only Terraform or Vault, for example, without building an integrated environment with Nomad or Atlas.
The CoreOS stacks I've seen are more integrated in platforms than the hashicorp ones (can be a good and a bad thing at once).
>"The hashicorp toolchain is great, but I've seen many OPs guys only using terraform or vault e.g. and don't build an integrated environment with nomad or atlas"
I think this is one of the nice things about HashiCorp's releases, though: they are building blocks, and you can choose to use them as standalone pieces (Consul), as subsets (Consul + Vault), or all of them in an integrated stack with Nomad and Atlas.
I agree with you, though: if you are a CoreOS shop, it certainly makes sense to choose etcd.
You are correct and I came here to say the same. Additionally, I've seen issues working with etcd: it is a pain to manage, and split-cluster situations come up where the data becomes corrupted and needs to be manually merged back together. That is a bad situation to be in, and it is usually better to design for failure, especially in cloud environments.
You can get split-brain problems with any clustered, agreement-based replication; etcd is not a special case.
These failures are the worst case for any technology like that and will always cause a bit of pain in production. But since the etcd guys are very open about how etcd behaves in case of failures and how they do quality testing, I started to trust etcd: https://coreos.com/blog/new-functional-testing-in-etcd/
I have read a blog post about continuous testing of Consul at high scale too (can't find the link right now), but I don't remember getting this much insight from it.
(Aside from the issues with reason #1 for not using consul noted below...)
The author says that Atlas is essentially "required" for bootstrapping. We simply use our cloud provider's "list-instances" command (with a filter) so that we don't rely on any third-party and never contact the internet. This is important to us for security and stability, and has kept our cluster running with over 1k members. We don't want to go down when Atlas or etcd's discovery service goes down.
At my work, we have figured out a way to do this completely automatically. We use Packer + Ansible to create the AMI, and Terraform to set up AWS and launch 3 Consul instances based on the AMI.
The last part is a Go program (started from Bash) that does the joining.
It finds all the Consul server nodes (using tags) and then runs `consul join` until it succeeds.
There's a bunch of error checking and timeouts and such to make sure it works correctly.
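A rough illustration of that join loop, as a minimal Python sketch rather than the Go program described above (which isn't shown in this thread). The function name `join_until_success`, the retry count and the delay are made up for illustration; the only real thing assumed is that the `consul join <address>...` CLI command is on the PATH. Discovering the server addresses (by instance tags) is left out.

```python
import subprocess
import time


def join_until_success(servers, attempts=10, delay=2.0,
                       run=subprocess.run, sleep=time.sleep):
    """Run `consul join` against `servers` until it exits 0.

    Returns the number of attempts it took. `run` and `sleep` are
    injectable so the loop can be exercised without a real agent.
    """
    if not servers:
        raise ValueError("no server addresses to join")
    cmd = ["consul", "join", *servers]
    last_error = ""
    for attempt in range(1, attempts + 1):
        result = run(cmd, capture_output=True, text=True)
        if result.returncode == 0:
            return attempt
        last_error = result.stderr
        if attempt < attempts:
            sleep(delay)  # give the servers time to come up
    raise RuntimeError(
        f"consul join failed after {attempts} attempts: {last_error}"
    )
```

Something like `join_until_success(["10.0.0.11", "10.0.0.12", "10.0.0.13"])` would then be called once the tagged instances have been listed; those addresses are placeholders.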
I'm hoping to post a blog post about this in the next month or so.
To add to this, I run `consul join` against all instances tagged as consul servers with the output of the instance list, on boot, using cloud-init's "per-boot" script area.
I didn't mean that you need to use Atlas, but I have never seen a nice automation implementation for a scalable Consul cluster that doesn't use Atlas. I've also built something around Atlas, mostly using Hiera in Puppet, but ever since I've used Atlas and etcd discovery, everything else looks like a workaround.
What provisioning do you use to integrate with your cloud provider's API?
We use consul in our infrastructure, and by default none of our internal hosts have internet access, so we can't use Atlas even if we wanted it. We deploy consul through salt, but information about our consul servers is extracted by our internal CMDB which predates consul.
I'm curious how etcd bootstraps. I don't see a better way to do it than multicast.
BTW: I absolutely hate that Consul (and it looks like etcd has this issue as well) uses HTTP for communication. It's so inefficient to obtain updates about changes this way. ZooKeeper, which everyone loves to hate (and which wasn't even created for service discovery), did this so much better: you have a single long-standing connection on which you subscribe to the updates you want to receive. It has much lower overhead and is simpler to code against.
Another thing that seems to be lost on people who promote their service discovery solution is that you don't always need to be strongly consistent; eventually consistent is perfectly fine. You don't really need Raft or Paxos for that.
It still feels like a hack instead of doing it properly. You're doing long polling instead of what you should do in the first place, which is push.
Again, I haven't worked with etcd yet, but in Consul, because it's "RESTful", when you monitor multiple services you need to maintain multiple requests.
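For reference, this is roughly what the long-poll pattern looks like against Consul's HTTP API, sketched in Python. It assumes a local agent on the default `http://127.0.0.1:8500` and uses the health endpoint with the `index` and `wait` query parameters plus the `X-Consul-Index` response header. The names `watch_service`, `fetch` and `rounds` are invented here, and error handling is omitted.

```python
import json
import urllib.request


def fetch(url):
    """One HTTP GET; returns (X-Consul-Index, decoded JSON body)."""
    with urllib.request.urlopen(url) as resp:
        return int(resp.headers["X-Consul-Index"]), json.load(resp)


def watch_service(name, on_change, base="http://127.0.0.1:8500",
                  wait="30s", fetch=fetch, rounds=None):
    """Long-poll one service and call `on_change` when its index advances.

    `rounds` limits the number of requests (None means loop forever).
    """
    index = 0
    done = 0
    while rounds is None or done < rounds:
        # The request blocks server-side until the index moves past
        # `index` or the `wait` period elapses.
        url = f"{base}/v1/health/service/{name}?index={index}&wait={wait}"
        new_index, body = fetch(url)
        if new_index < index:
            index = 0  # index went backwards: start over from scratch
        elif new_index > index:
            index = new_index
            on_change(body)
        done += 1
```

Each watched service needs its own loop like this, with its own outstanding request, which is the overhead being described: `watch_service("web", print)` covers exactly one service.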
Edit: Reading more about HTTP/2's frames and pipelining, it looks like it's possible to use it in a similar way to how it's done in ZK[1]. If gRPC allows that, then I suppose it does indeed solve this problem.
[1] Having a single long-standing connection that's not closed after receiving a response. The request frames could be used to place watches, and response frames would send the updates to the client.
etcd uses bidirectional streams for watchers. One TCP connection can maintain multiple streams. No matter what, you need to keep at least one connection open; ZooKeeper is no exception.
I'm no Consul expert, but the only problem I've seen so far is that the ID is bound to an IP, and that this IP is taken from the first network interface.
I hope they can move to GUIDs or some other solution soon and make Consul easier to use in environments where IPs are "dynamic" (at least from Consul's point of view), say inside Docker.
So a single node Consul cluster, when rejoining itself with a custom ID, would end up a leader of "itself" and not wait for the previous IP as a follower, forever? :)
I think I'm a little confused by what you are asking - are you wanting to run a consul server and agent on the same instance? You can do that and specify an ID appropriate to the consul service type for multi-tenancy situations of master nodes and client nodes. But that would be up to you given your deployment environments.
Are you wanting to have a single Consul service running for the whole environment (as client and server)? If you're running an HA environment, you should really have dedicated master nodes and non-voting clients for service discovery on instances. The member states of Consul clusters have an IP and an ID. Are you saying that the IP could be mixed up with a different ID? (It could in the cloud, but upon connecting to the leader it would update its registered information.)
Are you talking about cleaning up old instances registered in the cluster? Or that there could be a single master node which has a broken network connection to the other master nodes, so sees different cluster status information? In that case, you should be monitoring all your services. Always. There are a lot of tools out there which will clean up and correct voting clusters in those situations. We use Sensu and Kubernetes checks / services for a lot of monitoring and automatic reaction to changes in cluster health.
This was actually an issue in previous versions. As of (I believe) 0.7, a node is completely removed after a timeout (the reconnect_timeout and reconnect_timeout_wan options).
Is your concern that you just want it to bind to another interface besides the first one (the default)? If so, you can use the -advertise flag.
From the docs:
-advertise - The advertise address is used to change the address that we advertise to other nodes in the cluster. By default, the -bind address is advertised. However, in some cases, there may be a routable address that cannot be bound. This flag enables gossiping a different address to support this. If this address is not routable, the node will be in a constant flapping state as other nodes will treat the non-routability as a failure.
My concern is that when running Consul as a single Docker container, it gets completely stuck in a follower state, even though it's the only node in the cluster, if the IP changes on restart.
It is waiting for the previous IP, which was the leader, to join... My understanding is that it's not solved?
For the zombie problem, Consul has always deregistered dead nodes (and their associated services) automatically after 72 hours. Consul 0.7 added the ability to define a health check attribute called "deregister_critical_service_after" which will deregister a service automatically if a check is failed for longer than this value. This helps clean up services from processes that register themselves that might not get shut down cleanly for whatever reason.
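As a hedged illustration of that check attribute, here's a small Python sketch that builds a service registration body of the kind the agent's `/v1/agent/service/register` endpoint accepts. The service name, port, health URL and the `90m` value are made-up examples. In the HTTP API the field is spelled `DeregisterCriticalServiceAfter`, while config files use the snake_case name quoted above.

```python
import json


def service_definition(name, port, health_url,
                       interval="10s", deregister_after="90m"):
    """Build a registration body with an HTTP health check.

    If the check stays critical for longer than `deregister_after`,
    Consul removes the service from the catalog on its own.
    """
    return {
        "Name": name,
        "Port": port,
        "Check": {
            "HTTP": health_url,
            "Interval": interval,
            "DeregisterCriticalServiceAfter": deregister_after,
        },
    }


if __name__ == "__main__":
    # Example values only; this JSON would be sent with a PUT request
    # to the local agent to register the service.
    body = service_definition("web", 8080, "http://localhost:8080/health")
    print(json.dumps(body, indent=2))
```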
"Consul agent takes part in Raft, so doing infrastructure discovery in your whole environment means every system takes part in Raft"