Hacker News
by joshuaellinger 2193 days ago
I'm in the midst of returning from the cloud to coloc.

It's looking like about $50K in hardware gets me 10x the RAM/compute/storage/(internal) network that I get from the cloud at around $10K/mo. However, it is taking me about $25K in labor to set up. Hosting costs about $1K for a rack and a decent pipe to the server.
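As a rough sanity check on those numbers, here is a minimal break-even sketch in Python. It assumes the $1K hosting figure is per month and that the colo setup replaces the whole $10K/mo cloud bill; neither is stated outright above. It also ignores the 10x capacity difference, hardware refresh, and ongoing ops labor. The function names are made up for illustration.

```python
# Figures taken from the comment; the monthly hosting interpretation is an assumption.
HARDWARE_USD = 50_000        # one-time hardware purchase
SETUP_LABOR_USD = 25_000     # one-time labor to set everything up
COLO_MONTHLY_USD = 1_000     # rack + bandwidth, ASSUMED to be per month
CLOUD_MONTHLY_USD = 10_000   # current cloud bill per month


def break_even_months(upfront: float, colo_monthly: float, cloud_monthly: float) -> float:
    """Months until the upfront colo spend is paid back by monthly savings."""
    monthly_savings = cloud_monthly - colo_monthly
    if monthly_savings <= 0:
        raise ValueError("colo never breaks even if it is not cheaper per month")
    return upfront / monthly_savings


def cumulative_cost(months: int, upfront: float, monthly: float) -> float:
    """Total spend after a given number of months."""
    return upfront + months * monthly


if __name__ == "__main__":
    upfront = HARDWARE_USD + SETUP_LABOR_USD
    months = break_even_months(upfront, COLO_MONTHLY_USD, CLOUD_MONTHLY_USD)
    print(f"Break-even after about {months:.1f} months")
    for m in (6, 12, 24):
        colo = cumulative_cost(m, upfront, COLO_MONTHLY_USD)
        cloud = cumulative_cost(m, 0, CLOUD_MONTHLY_USD)
        print(f"{m:>2} months: colo ${colo:,.0f} vs cloud ${cloud:,.0f}")
```

Under those assumptions the upfront spend pays for itself in under a year, which is consistent with the "more than a hobby but not real scale" framing.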

So more than a hobby but not approaching real scale.

The main barrier to using a lambda model is if you have anything that smacks of a DB. It would take me $100K+ to transition away from SQL. If you can do lambda w/o a big retool cost, then it is probably viable. If you are just running a bunch of VMs on the cloud, it is pretty expensive.

...

Interestingly, my main client (a very large company) just went the other direction and moved all its compute to a system that is Spark underneath, running on Azure. They are trying to decommission some expensive Teradata instances. So far, it is a mixed bag -- it is a big step forward (for them) on anything that is 'batch-oriented', but 'interactive' performance is dismal.

Price appears to be a big motivation. I always forget that most large enterprises run on exotic stuff with crazy service contracts that makes the cloud look cheap.


> Price appears to be a big motivation. I always forget that most large enterprises run on exotic stuff with crazy service contracts that makes the cloud look cheap.

Don't forget the second factor of (not having) the workforce. Physical servers require a bunch of suckers who know physical infrastructure, agree to be on call 24/7, and deal with Dell/HP/colo full time on top of periodic trips to the datacenter.

Those "suckers" are easy to contract out to, and a lot of colo providers provide managed services or "remote hands" themselves as well.

In practice, when I was responsible for racks in two separate data centres, I spent 3-4 days a year in the data centres (combined); everything else was handled easily via tickets or remotely. Overall I've generally spent less time on devops with hardware in colos than with cloud setups.

But specifically, the amount of on-call work tends to come down to code quality and higher-level architecture, not whether you host in cloud or colo or rent managed servers - the "low level" problems become part of the noise floor very quickly in any system with reasonable failover.

As opposed to cloud ops suckers who accept 24/7 on call and just sit around panicking when cloud providers are down and the status page is full of lies?

Colo providers have massive outages too so it's the exact same thing in that regard.

If we're talking regular maintenance, like a RAID controller or a power supply going bonkers, AWS is always accessible: it lets you realize something is broken and create a new volume or instance in 5 minutes. Whereas with dedicated hardware you might be toast with no remote access and/or no spare parts.

There have been several AWS outages where EBS and EC2 instantiation were outright down, and you could not create new resources for a period of time. AWS is not “always accessible” unless you’re in their marketing department.

Sure, you know you’re down, you simply can’t do anything about it. Not so with your own hardware, which is why many orgs continue to run their own gear.

> Sure, you know you’re down, you simply can’t do anything about it. Not so with your own hardware

I don't buy it. Everywhere I've worked that colo'd or owned their DC had wider-reaching outages (fewer than cloud, but affecting more systems). Usually these had to do with power delivery (turns out multi-tiered backup power systems are hard to get right) or network misconfiguration (crash carts and lights-out tools are great, but not if 400 systems need manual attention due to a horrible misconfiguration).

I think folks underestimate the non-core competencies of running a data center. Also often underestimated is the value of running in an environment designed to treat tenants as potential attackers; unlike with AWS's fault isolation, when running your own gear it's really easy to accidentally make the whole system so interconnected as to be vulnerable, even if you make only good decisions while setting it up.