Hacker News new | ask | show | jobs
by strags 2240 days ago
Cost and Latency.

My team and I run the servers for a number of very big videogames. For a high-cpu workload, if you look around at static on-prem hosting and actually do some real performance bencharking, you will find that cloud machines - though convenient - generally cost at least 2x as much per unit performance. Not only that, but cloud will absolutely gouge you on egress bandwidth - leading to a cost multiplier that's closer to 4x, depending on the balance between compute and outbound bandwidth.

That's not to say we don't use the cloud - in fact we use it extensively.

Since you have to pay for static capacity 24/7 - even when your regional players are asleep and the machines are idle, there are some gains to be had by using the right blend of static/elastic - don't plan to cover peaks with 100% static - and spin up the elastic machines when your static capacity is fully consumed. This holds true for anything that results in more usage - a busy weekend, an in-game event, a new piece of downloadable content, etc... It's also a great way to deal with not knowing exactly how many players are going to show up on day 1.

Regarding latency, we have machines in many smaller datacenters around the world. We can generally get players far closer to one of our machines than to AWS/GCP/Azure, resulting in better in-game ping, which is super important to us. This will change over time as more and more cloud DCs spring up, but for now we're pretty happy with the blend.