by dx034 3502 days ago
Most startups hope that they'll suddenly need to increase capacity by 100x, but it almost never happens. Most vendors can provide dedicated servers within a few minutes (if you don't order too many at once), so scaling is still possible in the vast majority of cases.

Even if you always have to scale up for 1-2 hours per day, using dedicated hardware that's idle the rest of the day is probably cheaper in most cases.
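
To make that concrete, here is a rough back-of-the-envelope sketch. All prices and server counts are made-up assumptions for illustration, not quotes from any vendor, and the function names are hypothetical. It compares keeping peak-sized dedicated hardware running all day against a smaller cloud baseline that bursts for a couple of hours.

```python
def daily_cost_dedicated(peak_servers: int, dedicated_hourly: float) -> float:
    """Cost of running enough dedicated servers for the peak, 24 hours a day."""
    return peak_servers * dedicated_hourly * 24


def daily_cost_elastic(base_servers: int, peak_servers: int,
                       peak_hours: float, cloud_hourly: float) -> float:
    """Cost of a cloud baseline plus extra instances only during the peak."""
    base = base_servers * cloud_hourly * 24
    burst = (peak_servers - base_servers) * cloud_hourly * peak_hours
    return base + burst


# Hypothetical numbers: 10 servers at peak, 5 otherwise, a 2-hour daily peak,
# $0.10/hour per dedicated server and $0.30/hour per comparable cloud instance.
dedicated = daily_cost_dedicated(peak_servers=10, dedicated_hourly=0.10)
elastic = daily_cost_elastic(base_servers=5, peak_servers=10,
                             peak_hours=2, cloud_hourly=0.30)

print(f"dedicated: ${dedicated:.2f}/day, elastic: ${elastic:.2f}/day")
```

With these assumed prices the idle dedicated hardware still comes out cheaper. The result flips if the cloud price gap is smaller or the baseline is a much smaller fraction of the peak, so plug in your own numbers.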


Oh, for sure. A lot of startups don't need that ability. I work for a pretty infra-heavy startup, so AWS is simply required at this point. But we've hit AWS capacity limits at the worst times (one of our clusters processing 20k events/sec hit 100% utilization) and they literally had no capacity left for that instance type. It's not a perfect thing all the time.

But in the end, the pros significantly outweigh the cons. Our resource consumption is naturally extremely elastic. While we'll always need to slightly over-provision to maintain some headroom, adding/removing nodes throughout the variance saves quite a bit of $.
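
A minimal sketch of what "slightly over-provision and add/remove nodes" can look like as a sizing rule. The per-node capacity and the 20% headroom are assumptions picked for illustration (only the 20k events/sec figure comes from the comment above), and `nodes_needed` is a hypothetical helper, not part of any AWS API.

```python
import math


def nodes_needed(events_per_sec: float, per_node_capacity: float,
                 headroom: float = 0.2) -> int:
    """Return how many nodes cover the load plus a fractional headroom."""
    if per_node_capacity <= 0:
        raise ValueError("per_node_capacity must be positive")
    target = events_per_sec * (1 + headroom)
    # Always keep at least one node, even when the load drops to zero.
    return max(1, math.ceil(target / per_node_capacity))


# Assumed: each node handles about 2,500 events/sec.
peak = nodes_needed(20_000, 2_500)    # load at the daily peak
quiet = nodes_needed(5_000, 2_500)    # load during a quiet period

print(f"peak: {peak} nodes, quiet: {quiet} nodes")
```

The gap between the two counts is what you stop paying for when the load drops, which is where the savings come from.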

There are other benefits also:

1. You can get started for dirt-cheap or in some cases, free

2. There's a common API for requesting new instances and performing maintenance tasks

3. There are extra services available to help build your apps, such as SES, S3, and RDS, to name but a few that I found very helpful.

I'm not saying anything in this thread is wrong. But in software engineering, we say "write the code that only you can write", which is a suggestion (but not a rule) to use pre-built libraries instead of trying to make your own. Perhaps we should also say, "run the instances that only you can run".

>2. There's a common API for requesting new instances and performing maintenance tasks

Only true if you commit to vendor lock-in. If you use a higher-level cloud-agnostic library, then it likely works with OpenStack as well, so you can manage on-prem and off-prem instances the same way.
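
A rough sketch of the idea, assuming nothing about any real library: every name here (`NodeProvider`, `InMemoryProvider`, `scale_to`) is hypothetical, and the in-memory backend just stands in for a real cloud or OpenStack driver. The point is only that the scaling logic talks to one interface regardless of where the nodes live.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class NodeProvider(ABC):
    """Hypothetical provider-neutral interface for managing nodes."""

    @abstractmethod
    def create_node(self, name: str) -> str: ...

    @abstractmethod
    def destroy_node(self, node_id: str) -> None: ...

    @abstractmethod
    def list_nodes(self) -> List[str]: ...


class InMemoryProvider(NodeProvider):
    """Stand-in backend; a real one would call a cloud or on-prem API."""

    def __init__(self, prefix: str) -> None:
        self.prefix = prefix
        self._nodes: Dict[str, str] = {}
        self._counter = 0

    def create_node(self, name: str) -> str:
        self._counter += 1
        node_id = f"{self.prefix}-{self._counter}"
        self._nodes[node_id] = name
        return node_id

    def destroy_node(self, node_id: str) -> None:
        del self._nodes[node_id]

    def list_nodes(self) -> List[str]:
        return list(self._nodes)


def scale_to(provider: NodeProvider, desired: int) -> List[str]:
    """Add or remove nodes until the provider holds `desired` of them."""
    current = provider.list_nodes()
    for i in range(len(current), desired):
        provider.create_node(f"worker-{i}")
    for node_id in current[desired:]:
        provider.destroy_node(node_id)
    return provider.list_nodes()


# The same call works for either backend.
cloud = InMemoryProvider("cloud")
on_prem = InMemoryProvider("onprem")
print(scale_to(cloud, 3), scale_to(on_prem, 2))
```

The trade-off is that an interface like this can only expose what every backend has in common, so provider-specific features stay out of reach unless you break the abstraction.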

At a high enough scale, you have a lock-in _somewhere_. Spending time trying to abstract yourself from any lock-in can be wasteful.
You can also temporarily rent VPS instances, which are still cheaper than AWS, and add them to the cluster while waiting for dedicated hardware.
Unfortunately, mixing and matching ends up really complicating things, especially with security in mind. Many people run within a VPC, and bridging to another private network is, well, I don't really want to think about it at this time.
We've found OpenVPN to be our friend here: create an overlay network that doesn't really care if nodes are bare metal or "cloud".
I thought about that too, but as far as I can see, with OpenVPN the single OpenVPN server is a single point of failure, and all the traffic goes through that server, which quickly becomes a chokepoint. If I needed this again, I'd try out tinc first. It does not appear to have the single point of failure issue.
We have multiple standby servers to prevent the SPOF issue.

One problem we HAVE seen is a reduction in maximum bandwidth. Since we're CPU limited, however, it hasn't really been an issue.

That's the thing - it is much easier nowadays. Kubernetes requires your containers to run on a flat, shared network namespace, so your new machine joins that network. It is like running within a VPC. Software like Rancher makes adding a new server a matter of executing a one-liner on the server.