by cle
2438 days ago
> Then when a reader connects, instead of connecting directly to the nsqlookupd discovery service, the reader connects to a proxy. The proxy has two jobs. One is to cache lookup requests, but the other is to return only in-zone nsqd instances for zone-aware clients.

> Our forwarders that read from NSQ are then configured as one of these zone-aware clients. We run three copies of the service (one for each zone), and then have each send traffic only to the service in its zone.

Isn't this the default behavior of ELB/NLB to begin with? Why not just configure the zone-aware clients to call zonal LBs, instead of hosting your own LB?

Same with Consul. I'm not understanding what benefit Segment gets from using Consul vs. calling EC2 Metadata API to discover the AZ and then calling the appropriate zonal LB endpoint...that's not hard to do and avoids many extra dimensions of operational complexity.

It's also unclear to me how all this migration to intra-AZ routing affects Segment's resilience to AZ outages.
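To make the "discover the AZ, then call the zonal endpoint" suggestion concrete, here is a minimal Python sketch. It assumes the client runs on EC2 with IMDSv2 reachable at the standard link-local address. The endpoint map, the fallback hostname, and the function names `fetch_az_from_imds` and `pick_endpoint` are hypothetical and only there for illustration; this is not how Segment does it.

```python
import urllib.request

# Standard EC2 instance metadata service base URL (link-local address).
IMDS = "http://169.254.169.254/latest"


def fetch_az_from_imds(timeout=1.0):
    """Return this instance's availability zone using IMDSv2."""
    # IMDSv2 needs a short-lived session token before any metadata read.
    token_req = urllib.request.Request(
        f"{IMDS}/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": "60"},
    )
    with urllib.request.urlopen(token_req, timeout=timeout) as resp:
        token = resp.read().decode()

    az_req = urllib.request.Request(
        f"{IMDS}/meta-data/placement/availability-zone",
        headers={"X-aws-ec2-metadata-token": token},
    )
    with urllib.request.urlopen(az_req, timeout=timeout) as resp:
        return resp.read().decode().strip()


def pick_endpoint(zonal_endpoints, fallback, get_az=fetch_az_from_imds):
    """Pick the in-zone endpoint, or the fallback if none is known.

    zonal_endpoints: dict mapping AZ name -> endpoint (hypothetical values).
    fallback: endpoint to use when the AZ is unknown or metadata is unreachable.
    get_az: injectable so the selection logic can run off EC2.
    """
    try:
        az = get_az()
    except OSError:
        # URLError and socket timeouts are both OSError subclasses.
        return fallback
    return zonal_endpoints.get(az, fallback)
```

A caller would pass something like `{"us-west-2a": "...", "us-west-2b": "..."}` plus a regional endpoint as the fallback. The fallback is the part that bears on the AZ-outage question: a purely in-zone client has nowhere to go if its zone's endpoint is missing or unhealthy, so some cross-zone escape hatch is probably still needed.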
Beyond that, ELBs carry a significant cost if you run several for each internal service you might have, and the API is slow and cumbersome compared with Consul's service-centric API. From an operations POV, Consul's ACL system is also a lot more flexible than what AWS IAM can provide, so you can be sure your services are limited in what they can claim to be and what gets set up on their behalf. If you want to automate creation and configuration of ELBs, by contrast, you either have to grant more access than you really want or abstract that behind another service that you have to write.
As for AZ outages... in practice, a cross-AZ system is often just as vulnerable to problems from the outage of a particular AZ, especially if any autoscaling is involved. AWS's tools around this are severely lacking, despite what they tell us about resiliency best practices. But it all depends on the architecture and mostly the data layer.