by KenanSulayman 2877 days ago
I have used HAProxy very successfully as a layer 4 LB in the past, and still do. It's one of the fastest, to my knowledge. I have used HAProxy as the entry point to big Mesos clusters without any issues.

One example of using HAProxy as an L4 LB instead of letting it do the termination is when it is proxying TLS traffic to and from multiple backends. Or WebSocket traffic. Or even as a bastion LB for SSH, in case one bastion goes down.


It's not that HAProxy doesn't do L4. As I said, projects like GLB solve the problem of making the load balancer itself redundant: how to load balance the load balancer, so to speak.
For the cost of a ton of added complexity though? What do you get out of this solution that other solutions don't provide? Say DNS load-balancing, VRRP, CARP, or any other HA solution.
Not for most companies, but GitHub's scalability requirements are pretty extensive, and really cry out for something more sophisticated than the technologies you mention.

This stuff isn't exactly new; it's essentially the Maglev system described by Google in a 2016 paper. Other companies are now catching up to Google (which is of course 2+ years ahead).

well, you can't have 100% redundancy without a virtual IP or BGP. so basically glb-director is the same as just using haproxy + bgp. (BGP can do anycast/ECMP multipath really easily. well, you still need redundant network routers.)
> basically glb-director is the same as just using haproxy + bgp

BGP (really ECMP) doesn't handle failures gracefully; that's the benefit of GLB.
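
To make the "doesn't handle failures gracefully" point concrete, here is a minimal sketch comparing a naive modulo hash (a simplified stand-in for how an ECMP next-hop might be chosen; real routers vary) with rendezvous hashing. As far as I know GLB uses a variant of rendezvous hashing, but treat this as an illustration of the general idea and not as GLB's implementation. The server addresses, flow strings, and function names are all made up for the example.

```python
import hashlib

def _score(*parts: str) -> int:
    """Stable 64-bit hash of the joined parts (built-in hash() is salted per process)."""
    digest = hashlib.sha256("|".join(parts).encode()).digest()
    return int.from_bytes(digest[:8], "big")

def pick_modulo(flow: str, servers: list[str]) -> str:
    """Naive ECMP-style choice: hash the flow, index into the current server list."""
    return servers[_score(flow) % len(servers)]

def pick_rendezvous(flow: str, servers: list[str]) -> str:
    """Rendezvous (highest-random-weight) hashing: the best-scoring server wins."""
    return max(servers, key=lambda s: _score(flow, s))

def moved_flows(pick, flows, before, after):
    """Flows that changed server even though their old server is still alive."""
    return [
        f for f in flows
        if pick(f, before) in after and pick(f, before) != pick(f, after)
    ]

# Hypothetical backends and client flows, for illustration only.
servers = [f"10.0.0.{i}" for i in range(1, 6)]
survivors = servers[:-1]  # the last server fails
flows = [f"198.51.100.{i % 250}:{40000 + i}" for i in range(2000)]

modulo_moved = moved_flows(pick_modulo, flows, servers, survivors)
rendezvous_moved = moved_flows(pick_rendezvous, flows, servers, survivors)
print(f"modulo: {len(modulo_moved)} flows on healthy servers were remapped")
print(f"rendezvous: {len(rendezvous_moved)} flows on healthy servers were remapped")
```

With modulo hashing, a change in the server count reshuffles flows that were sitting on perfectly healthy servers, which resets their TCP connections. With rendezvous hashing, only the flows that were on the failed server have to move.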

Wouldn't it be possible to use DNS for this, with multiple A entries per LB, a TTL of 30 or 60? And remove unhealthy servers from the list? That would even come with IPv6 support.

Then you could address the LB with a name like some-service.lb.intranet and just use that wherever you would use the original service.
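
A rough sketch of what that could look like: health-check each LB and only emit records for the ones that respond. The function names, the addresses, and the plain TCP connect check are assumptions made for illustration; how the records get pushed to an actual DNS server is left out entirely.

```python
import ipaddress
import socket
from typing import Callable, Iterable

def tcp_alive(ip: str, port: int, timeout: float = 1.0) -> bool:
    """Treat a backend as healthy if a TCP connection can be opened."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False

def healthy_records(
    name: str,
    ips: Iterable[str],
    port: int,
    ttl: int = 30,
    check: Callable[[str, int], bool] = tcp_alive,
) -> list[str]:
    """Build zone-file style A/AAAA lines for the backends that pass the check."""
    records = []
    for ip in ips:
        if not check(ip, port):
            continue  # unhealthy LBs are simply left out of the answer
        rtype = "AAAA" if ipaddress.ip_address(ip).version == 6 else "A"
        records.append(f"{name}. {ttl} IN {rtype} {ip}")
    return records

# Example with a fake check so nothing touches the network:
# pretend 10.0.0.2 is down.
demo = healthy_records(
    "some-service.lb.intranet",
    ["10.0.0.1", "10.0.0.2", "fd00::1"],
    443,
    check=lambda ip, port: ip != "10.0.0.2",
)
for line in demo:
    print(line)
```

The check is passed in as a parameter so that it can be swapped for an HTTP probe or stubbed out in tests.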

Designs such as GLB can handle director-level failures mid-flow (I haven't looked deeply enough into GLB specifically to confirm this, but I would assume so), i.e. a connection won't be interrupted even if one of the directors dies (packet loss is still likely, but TCP will take care of it). That allows much faster recovery than solutions that depend on clients' DNS settings.

Additionally, DNS will leave your load balancing at the mercy of ISPs' DNS server settings. At least in the past, it wasn't exactly unheard of for ISPs to cache only a single A entry, so all of their clients would be directed to a single server.

That said, DNS-based load balancing is generally a good enough solution for most people.

The problem here is that you assume every client honors the TTL. That is a very bad assumption to make.
DNS failover works well in practice.
DNS failover looks like a neat idea, but it does not work that well. It can take a really long time for a new DNS entry to propagate. Also, using anycast/ECMP via BGP means that you have a single IP that is highly redundant because it can be backed by many servers.
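
A back-of-the-envelope way to look at the propagation argument: a client can keep hitting a dead address for roughly the time it takes to detect the failure plus however long the old record stays cached. The model below is a deliberate simplification, and `resolver_min_ttl` is a hypothetical parameter standing in for resolvers or clients that hold records longer than the published TTL.

```python
def worst_case_stale_seconds(
    check_interval: float,
    failures_to_mark_down: int,
    ttl: float,
    resolver_min_ttl: float = 0.0,
) -> float:
    """Rough upper bound on how long clients may still be sent to a dead server."""
    # Time until the health checker declares the server down.
    detection = check_interval * failures_to_mark_down
    # A cache that ignores short TTLs effectively raises the TTL.
    effective_ttl = max(ttl, resolver_min_ttl)
    return detection + effective_ttl

# 5 s checks, 3 failures to mark down, 30 s TTL, everyone honors the TTL.
print(worst_case_stale_seconds(5, 3, 30))       # 45
# Same setup, but some resolver keeps records for at least 5 minutes.
print(worst_case_stale_seconds(5, 3, 30, 300))  # 315
```

So both sides of the argument can be right: with well-behaved resolvers the window is under a minute, and with one badly behaved cache it is whatever that cache decides.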