Hacker News
by aedocw 3643 days ago
Have you compared the cost of maintaining your own CoreOS infrastructure at RAX just for Kubernetes to using Google's Container Engine? If your services are all containerized and deployed via k8s to begin with, it seems like you wouldn't have much reason to maintain your own infra at that point.

Yes, absolutely. We ran a four-month experiment on GCE: we built an off-site logging cluster with fluentd + Elasticsearch + Kibana. The performance was decent, but the cost of RAM and disk is way higher.

I will tell you that the economics are most definitely not there. This is a common misconception among the HN crowd in general: that public cloud infra is cheaper. For small footprints, public cloud makes sense, but once you get into larger footprints (300+ instances), it's far cheaper to lease dedicated hardware or DIY in colocation. We're running on approximately 40 dedicated rackmount servers for OpenStack and 6 for Kubernetes. To get the equivalent amount of disk and RAM, we would pay 2-3x at AWS or GCE. We could probably cut our cost by an additional 30% by moving what we have to colo, but we would lose some flexibility and would have to take on additional headcount.
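
To make the ratios above concrete, here is a rough sketch of the arithmetic. The per-server price is a made-up placeholder (no actual prices are given here); only the server count and the 2-3x and 30% ratios come from the comment.

```python
# Hypothetical placeholder: the comment gives ratios, not actual prices.
MONTHLY_COST_PER_SERVER = 1_000.0  # USD, assumed for illustration only

SERVERS = 40 + 6  # OpenStack hosts + Kubernetes hosts, as stated in the comment


def monthly_costs(cost_per_server: float, servers: int = SERVERS) -> dict:
    """Return rough monthly cost estimates using the ratios quoted in the comment."""
    dedicated = cost_per_server * servers
    return {
        "dedicated": dedicated,
        "cloud_low": dedicated * 2,   # lower end of "2-3x at AWS or GCE"
        "cloud_high": dedicated * 3,  # upper end of "2-3x at AWS or GCE"
        "colo": dedicated * 0.7,      # "an additional 30%" saving, before extra headcount
    }


if __name__ == "__main__":
    for name, cost in monthly_costs(MONTHLY_COST_PER_SERVER).items():
        print(f"{name:>10}: ${cost:,.0f}/month")
```

The colo figure ignores the additional headcount mentioned above, so it overstates the real saving.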

From a maintainability standpoint, GCE makes Kubernetes easy, which is a good thing if you've never run it before. It's not that hard to run it yourself, though. A senior-level systems engineer will be a Kube master after about two months of use. As a rough guess, it takes about 1/4 of an engineer-week to support our Kube cluster for a week. I think we could grow our cluster 20x without a significant workload increase for our ops team.

We are in the process of automating the last few manual aspects of our Kubernetes infra: load balancing and monitoring. We're building these in the same style as the rest of our pipeline: YAML files in a project repo. Simply drop your Datadog-style monitoring/metrics config and your load balancer spec in your project's GitHub repo, and the deployment pipeline will build out your monitoring, metrics, and LB automatically for you.
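
As a minimal sketch of the "drop a file in the repo" idea, a pipeline could decide which steps to run from the config files it finds. The file names and step names below are hypothetical, since the actual layout isn't described here, and the sketch only checks for the files rather than parsing the YAML.

```python
from pathlib import Path
import tempfile

# Hypothetical file and step names; the real pipeline's conventions are not given.
PIPELINE_STEPS = {
    "monitoring.yaml": "configure_monitoring",
    "loadbalancer.yaml": "configure_load_balancer",
}


def plan_steps(repo_root: Path) -> list[str]:
    """Return the pipeline steps implied by the config files present in a repo."""
    return [
        step
        for filename, step in PIPELINE_STEPS.items()
        if (repo_root / filename).is_file()
    ]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        repo = Path(tmp)
        # Simulate a project that only ships a load balancer spec.
        (repo / "loadbalancer.yaml").write_text("port: 443\n")
        print(plan_steps(repo))
```

A project with neither file gets an empty plan, so opting in is just a matter of committing the config.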

I'm curious: did you model reserved-instance pricing or on-demand pricing in your comparison? AWS typically charges ~1/3 of the on-demand price for a 3-year reserved instance. Comparing against reserved pricing would be much more apples-to-apples if you are purchasing hardware with CapEx that typically depreciates over 3-5 years.
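
A quick sketch of why the pricing model matters. It assumes, purely for illustration, that the 2-3x figure quoted earlier in the thread was based on on-demand pricing (which is exactly the open question), and applies the ~1/3 reserved ratio to it.

```python
# "~1/3" of on-demand for a 3-year reservation, per the comment above.
RESERVED_FRACTION = 1 / 3


def reserved_multiple(on_demand_multiple: float,
                      fraction: float = RESERVED_FRACTION) -> float:
    """Cloud cost as a multiple of dedicated cost once the reserved discount is applied.

    Assumes the input multiple was computed from on-demand prices.
    """
    return on_demand_multiple * fraction


if __name__ == "__main__":
    for multiple in (2.0, 3.0):
        print(f"{multiple:.0f}x on-demand -> {reserved_multiple(multiple):.2f}x reserved")
```

Under that assumption the gap would shrink to roughly parity or better for the cloud side; if the original numbers already used reserved pricing, nothing changes.
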

This is really interesting. Are you planning to open-source this pipeline automation? I'd be interested.