I think this is part of the problem. We're talking about the cloud because "everyone" thinks they need autoscaling (some do, many don't, and even fewer would if they weren't paying inflated cloud costs), and because they never consider that they can keep the base load on servers provisioned one way and auto-scale only the additional capacity. Actually considering that tends to change the economics dramatically.

I've set up multiple hybrid setups over the years, and what I've consistently found is that for the same price we can provision 2x-3x (more if egress is high) as much server capacity with managed hosting providers or in colo'ed environments as with cloud providers. That's fully loaded cost, including rates for devops contracts etc.

Very few people need to auto-scale up more than that. But since most people still want orchestration, it tends to cost little to set up their system so that if they have a spike, they can scale up extra capacity in a cloud. And in doing so, they can cut the amount of hardware they provision for the base load to whatever is cheapest.

The first times I did this, I was fully convinced going in that this would mean we'd set the base load around the lowest utilization over a typical 24/7 cycle, and spin up some cloud instances during daily peaks etc.

In practice, after actually testing what pays for a given scenario, I've yet to see that (scaling up/down for a typical 24/7 cycle or 8/5) pay off, though I'm sure it can for some people.

Managed servers proved, in actual real-life usage scenarios, to be sufficiently cheaper that unless your spikes were very brief and sharp[1], it was cheaper to provision enough base capacity to handle most or all of the normal daily spikes. What a hybrid setup bought us was the freedom to not overprovision for "what-if" scenarios.
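
To make the trade-off concrete, here is a minimal sketch of comparing base-load sizes against an hourly load profile. Everything in it is an illustrative assumption rather than a measured figure: the function names, the load profiles, and in particular the rates (burst cloud capacity priced at 3x the per-hour cost of always-on non-cloud capacity). It also ignores spin-up time, minimum billing periods and egress.

```python
def daily_cost(hourly_load, base_units, dedicated_rate, cloud_rate):
    """Cost of one day: always-on base capacity plus cloud burst for any overflow."""
    # Base capacity is paid for every hour, whether it is used or not.
    base_cost = base_units * dedicated_rate * len(hourly_load)
    # Cloud capacity is only paid for in hours where load exceeds the base.
    burst_unit_hours = sum(max(0, load - base_units) for load in hourly_load)
    return base_cost + burst_unit_hours * cloud_rate


def cheapest_base(hourly_load, dedicated_rate, cloud_rate):
    """Try every base level from 0 up to peak load and return the cheapest one."""
    candidates = range(0, max(hourly_load) + 1)
    return min(
        candidates,
        key=lambda b: daily_cost(hourly_load, b, dedicated_rate, cloud_rate),
    )


# Hypothetical prices per capacity unit per hour (assumptions, not real quotes).
DEDICATED_RATE = 1.0
CLOUD_RATE = 3.0

# Hypothetical 24-hour profiles, in capacity units needed per hour.
office_hours = [2] * 8 + [10] * 9 + [2] * 7   # 9-hour plateau at peak
brief_spike = [2] * 9 + [10] * 1 + [2] * 14   # 1-hour spike at peak

print(cheapest_base(office_hours, DEDICATED_RATE, CLOUD_RATE))  # 10
print(cheapest_base(brief_spike, DEDICATED_RATE, CLOUD_RATE))   # 2
```

With these made-up numbers the office-hours profile is cheapest when the base covers the full peak, while the one-hour spike is cheapest when the base covers only the quiet load and the rest bursts into the cloud. Plug in your own quotes and profile; the point is only that the comparison is cheap to run before deciding.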

That effect was significant enough that even e.g. SaaS services used almost entirely by office staff within a single time zone often do not save on auto-scaling vs. non-cloud servers scaled for peak use, because the 8-10 hour window of use that creates is far too long. It varies with your specific cloud cost and your cheapest alternative, but I've rarely found it pays to spin up cloud resources for spikes that on average last longer than 4-6 hours a day. That tends to rule out most "normal" cyclical use, especially as you can often even out the load by, for example, moving "cronjob heavy" parts of your workload outside the window.
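
One way to sanity-check a threshold like that is a simple break-even calculation. This is a rough sketch under a deliberately simplified model: one unit of always-on capacity is billed for all 24 hours, one unit of burst capacity is billed only for the hours it runs, and nothing else (spin-up time, minimum billing, egress) is counted. The function name and the price ratios are hypothetical.

```python
def break_even_hours(cloud_to_dedicated_ratio, hours_per_day=24):
    """Hours per day of burst use at which cloud and always-on capacity cost the same.

    cloud_to_dedicated_ratio: hourly price of one cloud capacity unit divided by
    the hourly price of one always-on non-cloud unit (an assumed input).
    """
    if cloud_to_dedicated_ratio <= 0:
        raise ValueError("price ratio must be positive")
    # Always-on cost per day:  hours_per_day * dedicated_rate
    # Burst cost per day:      burst_hours * ratio * dedicated_rate
    # Setting them equal and solving for burst_hours gives:
    return hours_per_day / cloud_to_dedicated_ratio


for ratio in (3, 4, 6):
    print(ratio, break_even_hours(ratio))
```

In this model, a spike that lasts longer than the break-even figure is cheaper to serve from always-on capacity. A ratio of 4 to 6 gives a break-even of 6 to 4 hours; those ratios are chosen for illustration and depend entirely on which cloud price (on-demand, reserved, negotiated) you compare against.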

Auto-scaling absolutely pays at far smaller variations in load if your only option is to have all your load in a cloud environment, but even then I see a lot of people resort to auto-scaling before they've even thought of cutting the cost of their base load by e.g. using reserved instances where it makes sense, or negotiating. Often ticking those boxes will have a much larger impact.

By all means ensure your system is built so that it can handle auto-scaling gracefully, though - it will benefit you whether or not you end up making much use of it.

[1] As an example where we got "close", one company I worked at had several clients that did large e-mail sends for restaurant chains that often included massive discounts. On the e-mails with the highest open rates, you'd then very predictably get massive traffic spikes from 8:00-8:15, 9:00-9:15, and 10:00-10:15 as people checked their e-mail when they got into the office, with the 9:00-9:15 peak being several times higher than their normal daily use. If they had been hosting this themselves, it'd have paid to auto-scale into a cloud env. to handle those spikes, especially as they didn't send such campaigns every day. In our case, most of our other customers had reasonably quiet mornings, so we could overprovision VMs from our base capacity for them at no extra cost to us. But this was also a rare exception.