
I think you are probably confused. There are many beta features in Kubernetes that are enabled by default. Critical features have stayed in beta for years, such as all of Ingress and all of CronJob.

- https://kubernetes.io/blog/2020/08/21/moving-forward-from-be...
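To make the point concrete, here is a rough sketch of how you might check which of your manifests lean on beta APIs by looking at their `apiVersion` strings. The function names and the sample manifests are mine, purely for illustration; the only thing assumed about Kubernetes is the usual `v1` / `v1beta1` / `v1alpha1` version naming, optionally prefixed by an API group.

```python
import re

# Kubernetes API versions look like "v1", "v1beta1", or "v2alpha1",
# optionally prefixed with a group such as "batch/" or "networking.k8s.io/".
_VERSION_RE = re.compile(r"^v\d+(?:(alpha|beta)\d+)?$")


def stability(api_version: str) -> str:
    """Return 'alpha', 'beta', or 'ga' for an apiVersion string."""
    version = api_version.rsplit("/", 1)[-1]  # drop the group prefix, if any
    match = _VERSION_RE.match(version)
    if match is None:
        raise ValueError(f"unrecognized apiVersion: {api_version!r}")
    return match.group(1) or "ga"


def beta_resources(manifests):
    """List (kind, apiVersion) pairs for manifests that use a beta API."""
    return [
        (m["kind"], m["apiVersion"])
        for m in manifests
        if stability(m["apiVersion"]) == "beta"
    ]


# Hypothetical manifests, reduced to the two fields that matter here.
sample = [
    {"kind": "Deployment", "apiVersion": "apps/v1"},
    {"kind": "Ingress", "apiVersion": "networking.k8s.io/v1beta1"},
    {"kind": "CronJob", "apiVersion": "batch/v1beta1"},
]
print(beta_resources(sample))
```

Run against real manifests, a check like this would show how much of a production deployment sits on beta APIs. It says nothing about what a provider's SLA covers.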

One of the big issues with the GKE SLA is that your organization must consume Kubernetes only from the Stable channel. But given the large number of enabled-by-default, critical beta features, many (probably most) production deployments of Kubernetes intentionally rely on non-stable channels, because those beta features have been de facto production features. At my company, for example, we must run a slightly older version of Kubernetes and upgrades are very slow, so we are way behind the Stable channel, with no way to upgrade fast unless the Stable channel supports all sorts of enabled-by-default beta features and older versions. So we could never run a hybrid cloud with GKE; it would violate the SLA restrictions from the outset. This has created a nasty, painful rift in my org between the on-prem Kubernetes and the (essentially useless) cloud GKE Kubernetes.

Beyond this, there are other critical features that are not covered, multi-region ingress for example. We operate some very large data ingestion services for customers, and we absolutely need a higher uptime SLA on them than what a single GCP region offers. So we have to operate multi-region ingress, but all of the non-Anthos solutions are no longer supported by GCP, and using them voids the individual region SLAs. It's madness.

On top of all this, Google does not actually publish clear lists of features that are or are not covered by the SLA. The wording relies solely on the Kubernetes alpha / beta / GA channels, but nothing actually ties Google legally to that. They can arbitrarily define the SLA terms to mean whatever they want them to mean at any time. While you likely can't avoid a cloud provider having that freedom, you could at least expect them to publish and document it.

> I’ve also always been able to assign permissions to a user, group, or service account. When have you not been able to do so?

Please reread my comment. I mentioned specific examples (user-based rather than service-account-based workflows in Dataproc, for example) where it's not possible in GCP, as in the product itself disallows it. It's not an issue of me or you or anyone being able to create IAM policies or service accounts. The issue is that different products within GCP fundamentally disallow some auth workflows, like Dataproc cluster creation being disallowed for user-based auth. That forces you to manage multiple different auth flow patterns even within the same user workflow: for example, user-based auth flows for GCE VMs, but service account impersonation for the exact same steps on a Dataproc cluster. The result is much more overhead, more inscrutable errors, and more round trips through security approval. The problem is poor product design, not some general inability of a user to figure out a service account.