by abhay 5702 days ago
No one denies that capacity planning is hard. There are books written on the subject. The points you make are exactly the reason why you need to do capacity planning and plan for mitigating failures. If you aren't planning for 2x growth (in fact, more), then I'm confused as to what kind of growth you really expect in your service.

If you aren't giving yourself room for expected and unexpected loads, you're doing it wrong. Add capacity and load testing to your process.

3 comments

If you aren't giving yourself room for expected and -->unexpected<-- loads, you're doing it wrong.

You're using that word; I'm not sure it means what you think it means.

Over here in the real world, many applications (and notably web applications) have one thing in common: they change all the time.

Your capacity plan from October might have been amazingly accurate for the software that was deployed and the load signature that was observed then.

Sadly now, in November, we have these two new features that hit the database quite hard. Plus, to add insult to injury, there's another old feature (that we had basically written off already) that is suddenly gaining immense popularity - and nobody can really tell how far that will go.

Sound familiar?

If you work on systems where you have the occasional 2x spike in traffic, or where planning for 2x capacity requirements in the future is easy, then you don't have the same problems suhail has.

I work in advertising, for example. We could have 10 partners at 1x. Add 10 more and we might be at 1.1x or 2x; then add a large partner and we're at 7x. There isn't a pattern to when we get partners from any of these groups, but when we get them they need to go live as quickly as possible, and sourcing and prepping hardware in situations like that isn't feasible. Nor is it feasible to keep hardware on standby for the occasional 7x partner, since you don't know when they are coming along and they could end up being a 10x partner.
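
To put rough numbers on that, here's a minimal Python sketch. The unit (percent of baseline load), the 2x provisioning level, and the per-partner loads are all made up for illustration; only the 1.1x and 7x totals come from the example above.

```python
# All loads are in percent of today's baseline (100 = 1x). The numbers are
# illustrative assumptions, not measurements.
BASELINE = 100
PROVISIONED = 200  # assumed: hardware sized for 2x


def load_after(baseline, new_partner_loads):
    """Total load once the given partners have gone live."""
    return baseline + sum(new_partner_loads)


def fits(provisioned, load):
    """True if the provisioned capacity covers the load."""
    return load <= provisioned


small_partners = [1] * 10   # ten small partners: 100 -> 110 (1.1x)
large_partner = [590]       # one large partner: 110 -> 700 (7x)

after_small = load_after(BASELINE, small_partners)
after_large = load_after(after_small, large_partner)

print(after_small, fits(PROVISIONED, after_small))  # 110 True
print(after_large, fits(PROVISIONED, after_large))  # 700 False
```

The 2x plan absorbs the ten small partners without anyone noticing and is nowhere near enough for the single large one.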

Capacity planning isn't just hard, it is costly. You have to profile every new version of your app, and every new version of the software you depend on. You have to update your planning models with that data, and then you have to provision extra hardware to handle whatever traffic spikes you think you'll be faced with within your planning window. Most of the time, those resources will be idle, but you will still be paying for them. Plus, in the face of an extraordinary event, you'll be giving users a degraded experience.

Using "the cloud" doesn't solve all those problems, but your costs can track your needs more closely, and with less up-front investment. Rather than carefully planning revisions to your infrastructure, you can build a new one, test it, cut over to it, and then ditch the old one.
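
As a back-of-the-envelope illustration of costs tracking needs, here's a small Python sketch comparing "buy for the peak" with "pay for what you use". The prices and the demand profile are hypothetical, and the per-hour rented price is deliberately set higher than the owned one.

```python
# Hypothetical prices in cents per server-hour. The elastic price is set
# higher on purpose, assuming rented capacity costs more per unit.
FIXED_PRICE = 3
ELASTIC_PRICE = 5


def fixed_cost(hourly_demand, price):
    """Provision for the peak and pay for it every hour."""
    return max(hourly_demand) * price * len(hourly_demand)


def elastic_cost(hourly_demand, price):
    """Pay only for the servers actually needed in each hour."""
    return sum(hourly_demand) * price


def idle_server_hours(hourly_demand):
    """Server-hours paid for but unused when provisioned for the peak."""
    return max(hourly_demand) * len(hourly_demand) - sum(hourly_demand)


demand = [10, 10, 10, 70, 10, 10]  # servers needed per hour, one 7x spike

print(fixed_cost(demand, FIXED_PRICE))      # 1260
print(elastic_cost(demand, ELASTIC_PRICE))  # 600
print(idle_server_hours(demand))            # 300
```

With a flat profile the comparison flips under these made-up prices: `[10] * 6` costs 180 fixed versus 300 elastic. The spiky case is where paying per use comes out ahead.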

You should still profile your app under load so you can be confident that you can indeed scale up easily, but even that is easier. You can bring up a full-scale version to test for a day and then take it down again.

I'm not against capacity planning, but it has its time and place.