by philips 3314 days ago
A big use case of this that we are thinking about is enabling people to use the etcd Operator[1], which makes it simple to run etcd clusters on Kubernetes, to back their ZooKeeper applications.

The neat thing about the etcd Operator is you can define a cluster and the etcd Operator takes care of normal operations by using the Kubernetes API.

  apiVersion: "etcd.coreos.com/v1beta1"
  kind: "Cluster"
  metadata:
    name: "example-etcd-cluster"
  spec:
    size: 5
    version: "3.1.8"
Pretty neat!
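
To make "takes care of normal operations" a bit more concrete, here is a toy reconcile loop in Python. It is only a sketch of the general operator pattern (compare the declared spec with what is observed, take one corrective step, repeat), not the etcd Operator's actual code. `ClusterSpec`, `next_action`, `apply_action`, and `reconcile` are hypothetical names, members are modeled as a plain list of version strings, and nothing here talks to the Kubernetes API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterSpec:
    """Desired state, mirroring the `size` and `version` fields of the spec above."""
    size: int
    version: str


def next_action(spec, members):
    """Return the single next corrective action, or None once converged.

    `members` is a list of version strings, one per running member
    (a hypothetical, simplified view of the observed cluster).
    """
    if len(members) < spec.size:
        return ("add", None)
    if len(members) > spec.size:
        return ("remove", len(members) - 1)
    for index, version in enumerate(members):
        if version != spec.version:
            return ("upgrade", index)
    return None


def apply_action(spec, members, action):
    """Apply one action to a copy of the member list and return the copy."""
    kind, index = action
    updated = list(members)
    if kind == "add":
        updated.append(spec.version)
    elif kind == "remove":
        updated.pop(index)
    elif kind == "upgrade":
        updated[index] = spec.version
    return updated


def reconcile(spec, members):
    """Take one step at a time until observed state matches the spec."""
    log = []
    while (action := next_action(spec, members)) is not None:
        log.append(action)
        members = apply_action(spec, members, action)
    return members, log


if __name__ == "__main__":
    # Hypothetical starting point: three members on an older version.
    final, steps = reconcile(ClusterSpec(size=5, version="3.1.8"), ["3.1.7"] * 3)
    print(final)
    print(steps)
```

The one-step-at-a-time shape is a design choice for the sketch: every pass re-reads the current state, so a crash between steps just means the next pass picks up where things actually are.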

Anyway, the zetcd project is still super young, but we would love more folks to try it out. As the post says, folks have already tried it with Kafka, Mesos, and others.

[1] https://coreos.com/blog/introducing-the-etcd-operator.html


Who are you targeting with this? The "hesitant" zookeeper folks that already depend on etcd? Are you hoping to unseat cdh here? Pardon the naive question here - I never bump into k8s when selling to traditional enterprise hadoop customers.

I'd also never pick kubernetes for my "from scratch" cluster due to already being reliant on the JVM stack. I actually like the idea of giving an IT department that already understands zookeeper a mesos cluster with DC/OS.

That being said - k8s has a ton of momentum but it seems to be mainly with startups or maybe niche teams (prove me wrong here?) outside of google. It would be great to understand what you guys are looking at for things like this. Right now it feels like k8s and a lot of the other startups in this space like pachyderm are trying to compete with the hadoop ecosystem (which is great! competition forces innovation which is good for the ecosystem as a whole)

Kubernetes is actually getting a solid amount of large-tech and early adopter enterprise deployment. That's still pretty nascent, but it's picking up quickly. Happy to discuss in more detail offline the adoption we're seeing.

The reason you don't bump into k8s while selling to Hadoop users is that Hadoop isn't something you'd run on a container-based stack (at least not right now, and IMO it won't be). There are lots of Hadoop users who run containers for their application infra (as opposed to data infra). Pachyderm's whole pitch is that containerized data infra can be really powerful and that enterprises will want to unify their stack to all be containerized and k8s is THE answer for the orchestration layer.

P.S. Despite all my opinions above, I actually agree with your initial question around who zetcd is actually targeting. I don't have a clear picture of that.

>"Pachyderm's whole pitch is that containerized data infra can be really powerful and that enterprises will want to unify their stack to all be containerized and k8s is THE answer for the orchestration layer."

Doesn't Pachyderm predate K8s though? Is this a recent development? Have they shifted focus then?

Sure! Feel free to reach out. I'm just commenting on a wider trend I'm seeing with parallels to the hadoop ecosystem popping up written in go that are container based. I agree you don't tend to run hadoop and co on containers. We tend to see the app side as well though. We do both microservices as well as hadoop infra.

Non-startup companies are adopting Kubernetes. You can see some of their stories on the Tectonic Summit website[1]: Ticketmaster, eBay, Concur, SAP, BNY Mellon, MLS, etc.

I will try to reply in the morning in depth on the other points.

[1] https://coreos.com/summit/

Appreciated! I'm wondering if these are just one-off teams though? We have "enterprise adoption" for our software but it doesn't mean company wide. One thing hadoop has been able to do is actually get deployed at scale. You can have small teams within companies using k8s for their apps. Some other parts of these companies can be too conservative to actually deploy new tech. The "nascent" adoption usually means innovation labs and one-off deployments for certain teams.

What I'm trying to gauge here is k8s as an actual "company wide platform". I would love for it to be something I can depend on to be at an enterprise in a few years. It's great technology but still feels like it needs to be beaten up a bit yet.

I work on OpenShift (which is k8s with tenancy) and there's a good mix of "production apps", "dense development clusters", and "single app experimentation" out there. Like all things about the future, it's here, just not evenly distributed.

You'd be surprised how many services you interact with on a daily basis are running on k8s (whole or in part).

It's still early, and many of the adopters today in large companies just happened to be making modernization efforts of their app-dev / app-deploy pipelines and moved to k8s or OpenShift. That said, it's certainly not ubiquitous yet.

This sounds more palatable to me. I definitely know it has traction and I wouldn't be surprised to see it powering quite a few of the bigger services, but it still feels like it's largely in the early-adopter phase. This is in line with what I have seen. I know it's "out there" but it's not exactly "RHEL" yet ;).

So I can run a Kubernetes cluster on Mesos and have the Zookeeper for Mesos deployed on the Kubernetes cluster using the etcd operator and zetcd.

Joking, of course.

The funny thing about the situation you describe is that there are real world examples of similar circular dependencies.

I recall GitHub having an issue like that where their build pipeline used Bower, which is hosted on GitHub. When shit hit the fan and a build broke the site, they couldn't build the "fix" because Bower didn't work.

My own experience working at CoreOS is that many of our projects exploit self-referentiality as it's a particularly useful property.

Off the top of my head:

- Quay.io, our registry service, is built and deployed by itself

- Clair, our static analysis tool for detecting security vulnerabilities, analyzes itself

- Tectonic, our enterprise Kubernetes distro, is "self-driving" and manages itself

- discovery.etcd.io, a service we run to make it easier to bootstrap new etcd quorums, is just a quorum of etcd nodes
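
For anyone who wants to check their own stack for loops like these, a minimal sketch in Python: write the "needs" relationships down as a graph and look for a cycle with a depth-first search. The graph below just encodes the joke setup from upthread (Kubernetes on Mesos, Mesos talking ZooKeeper to zetcd, zetcd backed by etcd, etcd run by the etcd Operator on Kubernetes); the node names and the `find_cycle` helper are made up for illustration.

```python
def find_cycle(graph):
    """Return one dependency cycle as a list of nodes, or None if there is none.

    `graph` maps each component to the components it depends on.
    """
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {node: WHITE for node in graph}
    path = []

    def visit(node):
        color[node] = GRAY
        path.append(node)
        for dep in graph.get(node, ()):
            state = color.get(dep, WHITE)
            if state == GRAY:
                # dep is already on the current path, so we have closed a loop
                return path[path.index(dep):] + [dep]
            if state == WHITE:
                cycle = visit(dep)
                if cycle:
                    return cycle
        path.pop()
        color[node] = BLACK
        return None

    for node in graph:
        if color[node] == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


# Hypothetical encoding of the setup joked about in this thread.
deps = {
    "kubernetes": ["mesos"],
    "mesos": ["zookeeper-api"],
    "zookeeper-api": ["zetcd"],
    "zetcd": ["etcd"],
    "etcd": ["etcd-operator"],
    "etcd-operator": ["kubernetes"],
}

if __name__ == "__main__":
    print(find_cycle(deps))
```

A cycle is not automatically a bug, as the examples above show, but it does tell you which components need a documented cold-start path.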

I think you are missing the point. It's like running docker registry on kubernetes.

If for some reason the cluster goes down, bootstrapping it might be a bit difficult.

Yes, that's the exact point I was trying to make. Things are fine until they're not, at which point it's surgery and tribal knowledge to fix them.

Of course you are joking, but keep in mind that _in the real world_ you would also operate the etcd that Kubernetes needs via the etcd Operator!

Oh, and here is a video explanation of the etcd Operator: https://youtu.be/Uf7PiHXqmnw?t=11