Hacker News
by ExoticPearTree 308 days ago
Shower thoughts: since service discovery already makes it fairly easy to know when a server is added to or removed from a pool, we could also discover a metrics endpoint exposing a limited set of values such as CPU load, memory load, and available threads. A helper process or thread running alongside the load balancer's main processes could populate and update, in near real time, the equivalent of HAProxy stick tables but with much richer information. When the next request hits the load balancer, you know “exactly” where to route it for best performance.

Author here. Two quick thoughts: 1. As I covered in an earlier part of this series, service discovery is not always easy at scale. High churn, partial failures, and the cost of health checks can make it tricky to get right. 2. Using server-side metrics for load balancing is a great idea. In many setups, feedback is embedded in response headers or health check responses so the LB can make more informed routing decisions. Hodor at LinkedIn is a good example of this in practice: https://www.linkedin.com/blog/engineering/data-management/ho...
I was thinking of something along the lines of a “map” of all the backends and their capabilities that would be recomputed every N seconds and atomically swapped with the previous one. The LB would then be able to decide where to send a request and also have a precomputed backup option in case the first choice becomes unavailable. You could also use those metrics to signal that a node needs to be drained of traffic, for example, so that no new connections are sent to it.

I understand the complexities of having a large set of distributed services behind load balancers. I just think there could be a better way of choosing a backend, one based on more than least requests, TTFB, and an OK response from a health check every N seconds.
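
To make the “map” idea above a bit more concrete, here is a rough Python sketch, not a production design. Everything in it is hypothetical: the `BackendMetrics` fields, the weights in `score`, and the `Router` class are made up for illustration, and the metrics are assumed to have already been scraped by some helper. The point is only the shape: build an immutable snapshot off to the side, publish it with a single reference assignment, and keep a precomputed backup next to the primary choice.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Optional


@dataclass(frozen=True)
class BackendMetrics:
    """Hypothetical per-backend metrics reported by a metrics endpoint."""
    cpu_load: float        # 0.0 to 1.0
    mem_load: float        # 0.0 to 1.0
    free_threads: int
    draining: bool = False  # True means: send no new connections here


@dataclass(frozen=True)
class RoutingSnapshot:
    """Immutable routing decision: best backend plus a precomputed backup."""
    primary: Optional[str]
    backup: Optional[str]


def score(m: BackendMetrics) -> float:
    # Assumed weighting, chosen only for illustration. Lower is better.
    return 0.5 * m.cpu_load + 0.3 * m.mem_load + 0.2 / (1 + m.free_threads)


def build_snapshot(metrics: Dict[str, BackendMetrics]) -> RoutingSnapshot:
    # Skip draining backends and backends with no free worker threads,
    # then rank the rest by score (name breaks ties deterministically).
    eligible = sorted(
        (name for name, m in metrics.items()
         if not m.draining and m.free_threads > 0),
        key=lambda name: (score(metrics[name]), name),
    )
    primary = eligible[0] if eligible else None
    backup = eligible[1] if len(eligible) > 1 else None
    return RoutingSnapshot(primary, backup)


class Router:
    def __init__(self) -> None:
        self._snapshot = RoutingSnapshot(None, None)

    def refresh(self, metrics: Dict[str, BackendMetrics]) -> None:
        # Called every N seconds by the helper. The new snapshot is fully
        # built first, then published by rebinding one attribute, so a
        # reader sees either the old map or the new one, never a mix.
        self._snapshot = build_snapshot(metrics)

    def pick(self, unavailable: FrozenSet[str] = frozenset()) -> Optional[str]:
        snap = self._snapshot  # read the reference exactly once
        for candidate in (snap.primary, snap.backup):
            if candidate is not None and candidate not in unavailable:
                return candidate
        return None


router = Router()
router.refresh({
    "a": BackendMetrics(cpu_load=0.9, mem_load=0.5, free_threads=4),
    "b": BackendMetrics(cpu_load=0.2, mem_load=0.3, free_threads=10),
    "c": BackendMetrics(cpu_load=0.1, mem_load=0.1, free_threads=20, draining=True),
    "d": BackendMetrics(cpu_load=0.4, mem_load=0.4, free_threads=8),
})
first_choice = router.pick()
fallback = router.pick(unavailable=frozenset({"b"}))
```

With these sample numbers, "c" is excluded because it is draining even though it is the least loaded, "b" becomes the primary and "d" the backup. In a real LB the swap would need whatever atomic pointer or RCU-style primitive the implementation language offers; the single attribute assignment here just stands in for that.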