| I'm an engineer at Cloudflare, and I work on Unimog (the system in question). You are right that even balancing of utilization across servers with different hardware is not necessarily the optimal strategy. But keeping faster machines busy while slower machines are idle would not be better. This is because the time to service a request is only partly determined by the time it spends being processed on a CPU somewhere. It's also determined by the time that the request has to wait to get hold of a CPU (which can happen at many points in the processing of a request). As the utilization of a server gets higher, it becomes more likely that requests on that server will end up waiting in a queue at some point (queuing theory comes into play, so the effects are very non-linear). Furthermore, most of the increase in server performance in the last 10 years has been due to adding more cores and to non-core improvements (e.g. larger caches). Single-thread performance has increased, but more modestly. Putting those things together, if you have an old server that is almost idle, and a new server that is busy, then a connection to the old server will actually see better performance. There are other factors to consider. The most important duty of Unimog is to ensure that when the demand on a data center approaches its capacity, no server becomes overloaded (i.e. its utilization goes above some threshold where response latency starts to degrade rapidly). Most of the time, our data centers have a good margin of spare capacity, and so it would be possible to avoid overloading servers without needing to balance the load evenly. But we still need to be confident that if there is a sudden burst of demand on one of our data centers, it will be balanced evenly. The easiest way to demonstrate that is to balance the load evenly long before it becomes strictly necessary.
That way, if the ongoing evolution of our hardware and software stack introduces some new challenge to balancing the load evenly, it will be relatively easy to diagnose and address. So, even load balancing might not be the optimal strategy, but it is a good and simple one. It's the approach we use today, but we've discussed more sophisticated approaches, and at some point we might revisit this. |
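To put rough numbers on the queuing argument above, here is a minimal sketch using the textbook M/M/1 formula for mean time in system, T = S / (1 - rho), where S is the service time and rho the utilization. This is only an illustration, not how Unimog models anything: the single-queue model, the function name `mean_response_time`, and the service times (1.5 ms on an old core, 1.0 ms on a new one) are all assumptions chosen for the example.

```python
def mean_response_time(service_time: float, utilization: float) -> float:
    """Mean time in system for an M/M/1 queue: T = S / (1 - rho).

    service_time: mean CPU time per request (any unit, e.g. ms).
    utilization:  fraction of time the server is busy, in [0, 1).
    """
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in [0, 1)")
    return service_time / (1.0 - utilization)


# Hypothetical per-request service times (assumed, not measured):
OLD_SERVICE_MS = 1.5  # older, slower core
NEW_SERVICE_MS = 1.0  # newer, faster core

# Case 1: keep the fast machine busy and leave the slow one nearly idle.
old_idle = mean_response_time(OLD_SERVICE_MS, 0.10)
new_busy = mean_response_time(NEW_SERVICE_MS, 0.90)

# Case 2: balance utilization evenly across both machines.
balanced_old = mean_response_time(OLD_SERVICE_MS, 0.50)
balanced_new = mean_response_time(NEW_SERVICE_MS, 0.50)

print(f"old server at 10%: {old_idle:.2f} ms")      # 1.67 ms
print(f"new server at 90%: {new_busy:.2f} ms")      # 10.00 ms
print(f"old server at 50%: {balanced_old:.2f} ms")  # 3.00 ms
print(f"new server at 50%: {balanced_new:.2f} ms")  # 2.00 ms
```

Under these assumed numbers, the nearly idle old server answers several times faster than the busy new one, and with even balancing neither server gets close to the steep part of the curve. Real servers have many cores and many queues, so the exact figures will differ, but the non-linear shape is the point.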
I'm sure your system has its benefits. I just get triggered by "load balancing", since the term is so pervasive while also being a highly misleading and defective metaphor.