| HN Mirror

The key part here is "machine utilization" and absolutely there was a ton of effort put into this. I think before my time servers were allocated to projects but even early on in my time at Google Borg had already adopted shared machine usage and therew was a whole system of resource quota implemented via cgroups.

Likewise there have been many optimization projects and they used to call these out at TGIF. No idea if they still do. One I remember was reducing the health checks via UDP for Stubby and given that every single Google product extensively uses Stubby then even a small (5%? I forget) reduction in UDP traffic amounted to 50,000+ cores, which is (and was) absolutely worth doing.

I wouldn't even put latency in the same category as "performance optimization" because often you decrease latency by increasing resource usage. For example, you may send duplicate RPCs and wait for the fastest to reply. That could be double or tripling effort.