by btown 916 days ago
GKE Autopilot is an attractive option here if you don't want to worry about node utilization and provisioning. Effectively you have an on-demand, infinitely-sized k8s cluster that scales up and down as you need new pods. There are some caveats, but it's an incredible onramp if you're coming from Heroku or a similar PaaS and don't want to worry about the infrastructure side of things: GitHub Actions building images and deploying a Helm chart to GKE Autopilot is a remarkably friendly yet customizable stack. Google should absolutely promote it more than it does. https://cloud.google.com/kubernetes-engine/docs/concepts/aut...
4 comments

Unfortunately, last I checked, the compute pricing for GKE Autopilot was almost double, so if you can beat 50% utilization, you might as well just keep the under-utilized Node around.
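
The 50% figure follows from simple arithmetic: on node-based billing, each unit of requested compute costs 1/utilization, so per-pod billing wins only when its price multiplier is below that. A minimal sketch, assuming a hypothetical multiplier of 2.0 (the rough "almost double" above, not a quoted price); the function names are made up for illustration:

```python
def break_even_utilization(per_pod_multiplier: float) -> float:
    """Node utilization at which node billing and per-pod billing cost the same.

    per_pod_multiplier is a hypothetical ratio: per-pod unit price divided by
    node unit price (2.0 stands in for "almost double").
    """
    if per_pod_multiplier <= 0:
        raise ValueError("multiplier must be positive")
    return 1.0 / per_pod_multiplier


def cheaper_option(utilization: float, per_pod_multiplier: float) -> str:
    """Compare cost per unit of *requested* compute under both billing models."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    node_cost = 1.0 / utilization      # you pay for idle capacity too
    per_pod_cost = per_pod_multiplier  # you pay only for what pods request
    if node_cost < per_pod_cost:
        return "nodes"
    if node_cost > per_pod_cost:
        return "per-pod"
    return "equal"


if __name__ == "__main__":
    print(break_even_utilization(2.0))
    for u in (0.4, 0.5, 0.6):
        print(u, cheaper_option(u, 2.0))
```

With a multiplier of exactly 2.0 the break-even sits at 50% utilization; a multiplier slightly under 2.0 pushes the break-even slightly above 50%.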
If this is “free GKE autopilot” (autopilot billed at the same price as regular Fly Machine compute), then that changes the way I think about Fly’s basic compute pricing a lot.

I would think they should highlight that a lot more in the product announcement!

Say more! What should we highlight more?
As someone not familiar with Fly's offering (but very interested for the same reasons as the post you're replying to!), a couple things come to mind if you're looking at convincing people familiar with k8s to move workloads here:

- https://fly.io/docs/ doesn't show any results when searching kubernetes or k8s or k3s.

- https://fly.io/blog/fks/ is self-admittedly snarky but also doesn't provide details about the product itself. It jumps straight into technical details - and while I like the openness about fault tolerance, there's no paragraph after the intro about what Fly Kubernetes is.

- What exactly does the combination of k3s and virtual-kubelet provide compared to standard k8s? Does it provide Secret and ConfigMap storage, namespaces, and all those expected things? Can we run things like the Kubernetes dashboard? cert-manager? nginx-ingress?

- On that note, what's the ingress story in general? Is Fly automatically routing traffic to the k8s cluster based on the ingress declarations? Are there limitations? Where are they documented?

- Most people running k8s will have fault-tolerant workloads, but reasonable expectations for pod lifetime and reliability of underlying "hardware" are nonetheless important. If I'm migrating from EKS or GKE and want to run a 24/7 background process, can I expect it to keep running on the same Fly Machine for weeks or months until updated? Or are there limits here? (This might be better documented for Fly Machine but it's worth documenting specifically in this context.)

Absolutely understand that this is an experimental work in progress. It's really cool work! But it's also impossible to even justify playing with as an experiment, with so many unanswered questions about where hard caps in the functionality may be hit.

If I use GKE or any other standard Kubernetes offering (excluding GKE Autopilot for now) with a variable workload and Node-level autoscaling, I will probably pay between 1.5x and 2.5x what my Pod requests sum to in compute costs, because of the difficulty of keeping Nodes well utilized.

It seems like with FKS, my pods will map directly to Fly Machines billing, so there’s no compute that I’m paying for but not using.
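
To make the 1.5x-2.5x range concrete: the overpay factor is just paid node capacity divided by the sum of pod requests, so 1.5x corresponds to roughly 67% utilization and 2.5x to 40%. A rough sketch with invented numbers (the pod sizes, node size, and node count are hypothetical, and only CPU is considered):

```python
from typing import Sequence


def overpay_factor(pod_requests_vcpu: Sequence[float],
                   node_vcpu: float,
                   node_count: int) -> float:
    """Ratio of vCPU paid for (whole nodes) to vCPU actually requested by pods.

    1.0 means every paid vCPU is requested by some pod, which is what
    per-pod billing would look like at the same unit price.
    """
    requested = sum(pod_requests_vcpu)
    if requested <= 0:
        raise ValueError("total pod requests must be positive")
    if node_vcpu <= 0 or node_count <= 0:
        raise ValueError("node size and count must be positive")
    paid = node_vcpu * node_count
    if paid < requested:
        raise ValueError("pods do not fit on the given nodes")
    return paid / requested


if __name__ == "__main__":
    # Hypothetical cluster: 16 pods requesting 0.5 vCPU each (8 vCPU total)
    # spread over 4 nodes of 4 vCPU each (16 vCPU paid).
    pods = [0.5] * 16
    print(overpay_factor(pods, node_vcpu=4.0, node_count=4))
```

A real comparison would also need memory, system overhead reserved on each node, and the actual unit prices of both platforms, none of which are given in this thread.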

GKE Autopilot is pretty much useless; there are very few cases where it actually turns out cheaper than simply using Cluster Autoscaler + Node auto-provisioning. Not only is the pricing absolutely absurd, they don't even allow normal K8s bursting behavior (requests need to be equal to limits), which means you not only end up paying more than for a regular K8s cluster but also need to heavily overprovision your pods.
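
The overprovisioning point can be sketched numerically. Assuming, as the comment above claims, that requests must equal limits, a pod that idles at a low baseline but occasionally spikes has to request its peak. The numbers and the `billed_vcpu` helper below are hypothetical and only illustrate the reasoning:

```python
def billed_vcpu(baseline_vcpu: float, peak_vcpu: float, burstable: bool) -> float:
    """vCPU a pod must request (and is billed for on per-request pricing).

    burstable=True models request=baseline with a higher limit for spikes.
    burstable=False models the requests == limits constraint described in
    the comment, so the request has to cover the peak.
    """
    if baseline_vcpu <= 0 or peak_vcpu < baseline_vcpu:
        raise ValueError("need 0 < baseline <= peak")
    return baseline_vcpu if burstable else peak_vcpu


if __name__ == "__main__":
    # Hypothetical service: idles at 0.25 vCPU, spikes to 1.0 vCPU.
    with_burst = billed_vcpu(0.25, 1.0, burstable=True)
    without_burst = billed_vcpu(0.25, 1.0, burstable=False)
    print(with_burst, without_burst, without_burst / with_burst)
```

In this made-up case the pod requests four times as much CPU once bursting is unavailable, and that factor stacks on top of any difference in unit price.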
Why would you use GKE Autopilot over Cloud Run?
Cloud Run is great if you just need to deploy a few services and expose their endpoints, and don't have a particularly complex backend service architecture.

But with more complex architectures, you'll end up implementing a sort of GKE-like layer over Cloud Run, at which point GKE would probably make more sense.

GKE lets you shell into containers, run all different kinds of workloads (e.g. no need for a separate "Cloud Tasks" system), supports stateful workloads, provides a standardized language for defining and deploying resources of all kinds (the k8s resource definition language), and as such integrates with standard gitops deployment systems such as ArgoCD.

My understanding is that Cloud Run is not suitable for stateful workloads (databases, etc.).