by hashtree 4996 days ago
I'd really love to jump on EC2, but every time I run the numbers it doesn't add up for my usage.

I currently colocate all my servers and I wanted to figure out just how much it would cost to switch over to EC2. After much digging and benchmarking, it seems that a single ECU is roughly equivalent to 350 to 400 points on PassMark. With this information and load metrics, it is pretty easy to determine how many ECUs I would need to switch over (RAM and disk are pretty straightforward): http://www.cpubenchmark.net/cpu_list.php.
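A rough sketch of that conversion in Python, for anyone who wants to run their own numbers. The 350 to 400 points-per-ECU ratio is the estimate from the comment above; the function names, the example PassMark score, and the utilization figure are made up for illustration.

```python
import math

# Estimate from the comment above: one ECU is roughly 350 to 400 PassMark points.
PASSMARK_PER_ECU_LOW = 350
PASSMARK_PER_ECU_HIGH = 400


def ecu_range(passmark_score):
    """Return (low, high) ECU estimates for a CPU with the given PassMark score."""
    low = passmark_score / PASSMARK_PER_ECU_HIGH
    high = passmark_score / PASSMARK_PER_ECU_LOW
    return low, high


def ecus_needed(passmark_score, peak_utilization):
    """Conservative ECU count needed to cover a server's peak CPU load.

    peak_utilization is the fraction of the CPU in use at peak (0.0 to 1.0),
    taken from your own load metrics.
    """
    _, high = ecu_range(passmark_score)
    return math.ceil(high * peak_utilization)


# Hypothetical server: a CPU scoring 7000 on PassMark that peaks at 50% load.
low, high = ecu_range(7000)
print(f"{low:.1f} to {high:.1f} ECUs of raw capacity")
print(ecus_needed(7000, 0.5), "ECUs to cover peak load")
```

Using the high end of the range errs on the side of over-provisioning, which seems like the safer direction when pricing out a migration.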

Came to the same conclusion as I did a few years ago. For my scenario (about a rack of servers, established business, 24/7 usage, capacity to handle a 10-fold increase in usage, and much more within a 2-hour window), I save roughly $170,000 over 3 years by doing it all myself (server costs included). This is compared with 3-year reserved instances.

It should be noted that I build our servers from the ground up and do all the ops.

2 comments

Funny, this is exactly the problem I'm working on with my new project, that I just started rolling out this week (shameless plug): https://uptano.com

I've created setups like yours a bunch of times (anywhere from 10-1000 servers) and always check EC2 to see if it's a viable alternative. The messy details of proper facilities management get old real quick. But, as you say, the math just doesn't work.

With existing companies you end up paying huge premiums to rent virtual instances on heavily shared hardware. You're giving up a proper network (with internal and external connections) and persistent high performance disk I/O. It's just not a great deal.

I'm trying to get it much closer to the pricing of doing it yourself, while keeping the convenience of not having to actually do it all yourself.

Would love to get some feedback from you, if you're willing. No email in your profile. Mine is jake@uptano.com.

If you include how much your salary, bonuses, equity and health care costs the business, does it change your calculations?
It does.

The setup took a couple months to research, build, and make "perfect". However, ongoing it takes little time to maintain (less than an hour a week, if averaged). Every three years, I build new servers and place them into service (2 to 3 weeks of time to perform). I also do periodic hardware maintenance roughly every three months (typically 1/4 of a day to perform).
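To put those time figures in one place, here is a back-of-the-envelope tally in Python. The 8-hour workday and 40-hour workweek are my assumptions (the comment doesn't state them), and the initial couple of months of setup is left out since it is a one-time cost.

```python
HOURS_PER_WORKDAY = 8     # assumption, not stated in the comment
HOURS_PER_WORKWEEK = 40   # assumption, not stated in the comment
WEEKS_PER_YEAR = 52
YEARS = 3


def ops_hours_over_three_years(rebuild_weeks):
    """Total hands-on hours for one three-year hardware cycle."""
    # Upper bound: "less than an hour a week, if averaged".
    weekly = 1 * WEEKS_PER_YEAR * YEARS
    # New servers built and placed into service once per cycle.
    rebuild = rebuild_weeks * HOURS_PER_WORKWEEK
    # Hardware maintenance every three months, a quarter of a day each time.
    quarterly = 4 * YEARS * (HOURS_PER_WORKDAY / 4)
    return weekly + rebuild + quarterly


low = ops_hours_over_three_years(2)
high = ops_hours_over_three_years(3)
print(f"{low:.0f} to {high:.0f} hours per three-year cycle")
```

Under those assumptions the ongoing work comes to somewhere around 260 to 300 hours per cycle.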

Due to the cost savings, we also are able to do quite a bit of redundancy, such as: dual PSUs, SSDs in RAID 10 on non-SAN servers, RAID-Z2/3 on SAN servers, offsite backups, complete server redundancy, spare servers ready to be slotted (I live an hour from colo), spare parts on hand, even multiple physical colos.

If components are selected carefully (i.e. sharing components between server roles), regular maintenance is performed, and redundancy is ensured on a per component, per server, and per datacenter level, it's not very time intensive or costly.

I am a software engineer by trade, but love the ins and outs of hardware/ops. As such, everything that can be automated and scripted is. I can raise/move instances in minutes, just like EC2 (currently use XCP).

Even with the research, it still saves roughly $100k every 3 years.

These kinds of numbers scare the shit out of me. Here I am thinking a couple of linodes may cover what I want (am intrigued by uptano (https://uptano.com/) linked above though).

How do I go about estimating my real needs? I mean, I hope you are running some major stuff for money whereby you save $60K a year. Holy shit!

That says more about how outlandishly expensive EC2 would be (for us) than about what we actually spend.

A few things:

Eliminate the middle men. Who do the small/medium datacenters use to build their custom hosting hardware? It's likely someone like Ma Labs where you get quite a savings over Amazon/Newegg, particularly when you buy components for many servers at a time.

Pay in advance, when it makes sense and is possible. Talk to colo operators; you can likely get a better deal if you pay for 1 or 3 years up front.

When you build the hardware yourself, you can do things no operator can do for you... tailor it exactly for your own domain needs.

Analyze your domain needs to define the server roles you might need (e.g. load balancer, app server, relational database, hadoop cluster, nosql database, key/value stores like redis, etc). That will lead you to commonalities in hardware/component needs. Now you can develop a few physical server types, order in bulk, and not have to keep so many spare components on hand.

For us, we are able to split out our datacenters by "critical"/"non-critical" for huge cost savings. Our "critical" datacenters host traditional production level servers: things that MUST have uptimes of four or five nines. We can get 50Mbps 95th percentile, quarter rack, 10 amp for roughly $400 a month. These are great, but you have to make the most of each U.
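For anyone unfamiliar with "95th percentile" bandwidth pricing, here is a minimal sketch of how it is usually worked out: usage is sampled at regular intervals (commonly every 5 minutes), the top 5% of samples are thrown away, and you are billed at the highest remaining sample. This is the generic nearest-rank version; the function name and sample data are hypothetical, and individual colo providers may sample or round differently.

```python
def billable_95th_percentile(samples_mbps):
    """Return the 95th-percentile rate using the nearest-rank method."""
    if not samples_mbps:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_mbps)
    # Ceiling of 95% of the sample count, done in integers to avoid float rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


# Hypothetical billing period squeezed into 100 samples:
# 95 quiet intervals at 20 Mbps and 5 bursts at 200 Mbps.
samples = [20] * 95 + [200] * 5
print(billable_95th_percentile(samples), "Mbps billable")
```

The point of the scheme is that short bursts are free: in this example the five 200 Mbps spikes fall in the discarded top 5%, so the billable rate stays at the quiet level. One more burst interval would push the billable rate up to the spike level.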

We do a lot of machine learning, map/reduce, and general processing. The app needs this, but because I coded for it... if the uptime is, say, 99% and not 99.9999% it has VERY little impact on our end users (think of these as worker dynos). Now, I can have a whole rack here at the office able to handle 90% of outages without issue. The nice thing about this is, I no longer have to make the most of each U. I can now build servers completely differently than I would in a traditional datacenter. It also comes with little added expense to our normal operations (add redundant internet, networking, UPSes, and insurance). I can build a 1U, 25 ECU-equiv, 32GB, SSD based server for ~1.1k. Fill the rack! :)
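To make the gap between those uptime figures concrete, here is a small Python sketch that converts an availability percentage into allowed downtime per year. It assumes a 365-day year and is plain arithmetic, not anything specific to the setup described above.

```python
HOURS_PER_YEAR = 365 * 24  # assumes a non-leap year


def downtime_hours_per_year(availability_percent):
    """Hours of downtime per year permitted at the given availability."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


for pct in (99.0, 99.99, 99.999, 99.9999):
    hours = downtime_hours_per_year(pct)
    print(f"{pct}% uptime allows about {hours * 60:.1f} minutes of downtime a year")
```

At 99% that works out to roughly 87.6 hours a year, while four nines allows under an hour, which is why workloads that tolerate 99% can live on much cheaper infrastructure.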

These sound like excellent pointers. Do you have such large needs because of the type of pages/apps you are serving (streaming, for instance, or heavy analytical processes in the ML), or simply because you have a helluva lot of users?
At the kind of scales that OP is operating at, you are almost certainly going to need a system administration person with either option.