Hacker News new | ask | show | jobs
by mschuster91 2 days ago
I somewhat agree with you... but it's not like you don't need some actual experts who know what they're doing, especially when stuff goes bonkers and it will go bonkers.

Even on AWS EKS, you will run into bullshit with their network overlay. Egress policies are a mess (at least half a year ago, you were not able to say something like "allow pod A to egress traffic to service (!) B" despite a service resolving down to an IP address in the end.

And that's before going into the unholy mess that is getting connectivity to and from the external world to your cluster. Cloudfront, ACM certificates, ALB, ALB-EKS integration, Route53, Route53-EKS integration, EFS, EFS-EKS integration, EBS, EBS-EKS integration, RDS, RDS-EKS integration, IAM-EKS integration, SSM, SSM-EKS integration, autoscaling... and if you want more pain and don't already wince, try setting that up across regions or, as I had to do once, across account boundaries.

Kubernetes is powerful. But do not make the mistake of assuming it's easy to get started with, at least on the admin side. Even if you got prior AWS experience, getting it all integrated into EKS so you don't have to deal with Terraform and helm/k8s for a full deployment of a piece of software will take you an awful lot of time.

For users though? It's a breeze, I will admit as much. Everything down to the firewall rules can be encoded in k8s spec files.

1 comments

If you struggle with any of that (a lot of what you listed is not strictly necessary to running managed kubernetes, specifically EKS) you are also going to struggle with a lot of other things on AWS, or wrangling a VM setup at any kind of scale.
> a lot of what you listed is not strictly necessary to running managed kubernetes, specifically EKS

Oh it's not necessary per se but if you want to host a web service with any sort of state and not having to do stuff in parallel either by hand or by terraform, I'd consider the integrations pretty vital.

It's easy enough (well, it's still addons whose versions you have to keep updated each on their own) once it is set up, but getting to the point where you have something reproducibly running for the first time is annoying as hell.

I think the best supported and most mature pattern on most big cloud providers is precisely

> do stuff in parallel either by hand or by terraform

…specifically by terraform. Making k8s own the provisioning and management of external infrastructure on principle (as opposed to when that makes sense, e.g. load balancers/gateway/CSI providers) is not a good approach. Sure, it feels unified, but the cost of unification is incredibly not worth it.

> Sure, it feels unified, but the cost of unification is incredibly not worth it.

That's the cost I was talking about. It is indeed annoying and time-consuming to get it set-up once, but once it works... it is amazing for developers to have the ability to spin up a completely identical to prod environment for a hotfix branch to test stuff out, with no involvement from ops or anyone else.

And also, it's much easier IMHO to get a mental image of how a system is constructed when it's one architecture - no matter if it's k8s/helm or Terraform. But as soon as you have both in the mix, you get friction issues, you have to pass stuff from Terraform to Helm or vice versa... and may God have mercy upon you if you also have Ansible in the mess, I had to do that once for a piece of proprietary dependency that would not have been supported by the vendor in any place other than a SLES bare metal server.

Yea, I used to believe this too, and still sort of agree - I got so tired of the argument in maintaining k8s infra in terraform I gave up and wrote what is essentially a terraform wrapper module around helm. The charts break terraform quite a bit sometimes, so you have to keep it simple, and god help you if you want to use CRD's, hashicorp providers have the notion no one actually needs those.

I had dismal hopes of it working for very long but it's remained mostly untouched going on 3 years now which really surprised me, and it's been easy to work with. I think if you run EKS resources like node groups, autoscalers, LB type of resources in the same state file as helm deployments you're going to have a very bad time though.

> I think if you run EKS resources like node groups, autoscalers, LB type of resources in the same state file as helm deployments you're going to have a very bad time though.

There's no alternative to that anyway... otherwise even a terraform apply -refresh=false will quickly take well over 10 minutes.

So they're holding it wrong?