by antoncohen 3104 days ago
I disagree that Jenkins is king for this. Jenkins is a single point of failure; it isn't a highly available distributed scheduler. It is a single master with slaves. While it is easy to configure Jenkins jobs with code (Job Builder, Job DSL, Jenkinsfiles), it is a pain to manage Jenkins itself with code. Plugins, authentication, all the non-job configuration, that is usually done via the GUI.

Saying Jenkins can be configured in a day, to the degree that Stripe configured Kubernetes (with Puppet), is disingenuous. It would take more than a day just to do the configuration management of the slaves and get the right dependencies in place for all the jobs.

How do you isolate job executions in Jenkins? In Kubernetes each job is inherently isolated in containers. In Jenkins you have a bunch of choices. Do you only run one executor per slave? OK, but then you have a bunch of wasted capacity some of the time, and not enough capacity other times. You could dynamically provision EC2 instances to scale capacity, but then you need a setup to bake your slave AMIs, and you have potentially added ~3 minutes to jobs for EC2 provisioning. You can run the jobs in Docker containers on the slaves. That will probably get you better bin packing, but it doesn't have resource management in the way Kubernetes does, so you could easily overload a slave (leading to failure) while other slaves are underutilized.
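To make the overload point concrete, here is a minimal Python sketch. It is a toy model, not how Jenkins or the Kubernetes scheduler is actually implemented: the agent names, CPU capacities, job names, and CPU figures are all made up for illustration, and the "aware" placement is a plain first-fit on declared CPU requests.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    cpu_capacity: float                      # cores on this agent (hypothetical)
    jobs: list = field(default_factory=list)  # (job_name, cpu_cores) tuples

    @property
    def cpu_requested(self) -> float:
        return sum(cpu for _, cpu in self.jobs)


def place_round_robin(jobs, nodes):
    """Ignore resource needs and hand jobs out in turn, like fixed executor slots."""
    for i, job in enumerate(jobs):
        nodes[i % len(nodes)].jobs.append(job)


def place_with_requests(jobs, nodes):
    """First-fit on declared CPU requests; returns the jobs that have to wait."""
    pending = []
    for job in jobs:
        _, cpu = job
        target = next(
            (n for n in nodes if n.cpu_requested + cpu <= n.cpu_capacity), None
        )
        if target is None:
            pending.append(job)
        else:
            target.jobs.append(job)
    return pending


def overloaded(nodes):
    """Names of nodes whose jobs ask for more CPU than the node has."""
    return [n.name for n in nodes if n.cpu_requested > n.cpu_capacity]


# Hypothetical workload: two heavy builds and two light jobs.
jobs = [("build-a", 3.0), ("lint", 0.5), ("build-b", 3.0), ("docs", 0.5)]

naive = [Node("agent-1", 4.0), Node("agent-2", 4.0)]
place_round_robin(jobs, naive)

aware = [Node("agent-1", 4.0), Node("agent-2", 4.0)]
waiting = place_with_requests(jobs, aware)

print("overloaded without requests:", overloaded(naive))
print("overloaded with requests:   ", overloaded(aware), "waiting:", waiting)
```

In this toy run the slot-style placement puts both heavy builds on the same agent while the other sits nearly idle, whereas the request-aware placement spreads them out. A job that fits nowhere would wait in `pending` rather than push an agent over capacity.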

Doing Jenkins right is not easy. There are solutions to all the problems, but it isn't a case of just firing it up and having it work.

Stripe was running Chronos before, which is a Mesos scheduler. So they have experience with distributed cluster schedulers. They were probably comfortable with the idea of Kubernetes.

They mention this as a first step to using Kubernetes for other things. So they probably wanted to use Kubernetes for other things, and this seemed like a low-risk way to get experience with it. Just like GitHub started using Kubernetes for their internal review-lab to get comfortable with it before moving to riskier things (https://githubengineering.com/kubernetes-at-github/).


> it is a pain to manage Jenkins itself with code. Plugins, authentication, all the non-job configuration, that is usually done via the GUI.

This is not true; all the configuration is scriptable via Groovy scripts. We run a bunch of Groovy startup scripts that configure everything post-launch. There is an effort by the Jenkins team to support this better[1].

> How do you isolate job executions in Jenkins? In Kubernetes each job is inherently isolated in containers.

We run one Docker container per build on Docker Swarm. Each build gets its own isolated, clean environment. There is no EC2 provisioning etc. We already own and maintain a Docker Swarm setup; we just run Jenkins and the Jenkins agents on it. I assume that if you are using Kubernetes the setup would be similar.

> Jenkins is a single point of failure; it isn't a highly available distributed scheduler.

I agree with this to an extent. If you are running Jenkins on a scheduler, it can be rescheduled, but your in-flight jobs are dead.

1. https://github.com/jenkinsci/configuration-as-code-plugin

> > it is a pain to manage Jenkins itself with code

> This is not true, all the configuration is scriptable via groovy scripts. [...] There is an effort to support this better[1] by jenkins team

The link you gave confirms it by saying that managing Jenkins with code "require you know Jenkins internals, and are confident in writing groovy scripts". Neither GUIs (like the one shown in your link) nor procedural languages (like Apache Groovy, still procedural even though its collection API is crippled for Jenkins pipelines) are very good for configuring software. Nor is an unreadable declarative language (like XML).

A readable declarative language (like YAML, as shown in your link) is the solution. Languages like Groovy were an over-reaction against unreadable XML in the Java ecosystem. The correct solution is to switch from an unreadable to a readable declarative language for configuring software.

> Languages like Groovy were an over-reaction against unreadable XML in the Java ecosystem. The correct solution is to switch from an unreadable to a readable declarative language for configuring software.

I somewhat agree with you. Unfortunately, the Jenkins team seems to have bet in the opposite direction by going full Groovy: https://github.com/jenkinsci/pipeline-examples

Since when does Docker Swarm support build steps pre-launch?

Are you sure you're not just using plain Docker on Docker Swarm nodes?

> isn't a highly available distributed scheduler.

Bingo! That's the point: it's a cron replacement.

But to tackle your first point, K8s might be distributed, but it's not inherently reliable. Yeah, sure, people run it in production, but there are a myriad of bugs that you bump into. I've lost clusters to tiny issues that ran rampant, something that I've not had with other cluster or grid engine systems.

If we are talking AWS, then having the Jenkins master in an Auto Scaling group with decent monitoring sorts out most of your uptime issues.

The reason I say it'd take a day to configure Jenkins is that the jobs have already been set up in Chronos. It should literally be a copy-pasta job. All the hard work of figuring out which jobs are box killers, which can share, and which are a bit sticky has been done already; all that's changing is the execution system.

What level of isolation are you after, and for what purpose? If jobs can't live on the same box, then that's almost certainly bad job design. (Yes, there are exceptions, but unbounded memory or CPU usage is just nasty.) There may be a need for regulatory isolation, but containers are not currently recognised as isolated for that purpose.
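As a rough illustration of what bounding a job can look like without containers, here is a small Python sketch that caps a child process's CPU time using the standard `resource` and `subprocess` modules. It is Unix-only, the `run_with_cpu_limit` helper and the one-second limit are made up for the example, and it is not something Jenkins, Chronos, or Kubernetes does for you in this form.

```python
import resource
import subprocess
import sys


def run_with_cpu_limit(argv, cpu_seconds):
    """Run argv as a child process that is stopped after cpu_seconds of CPU time."""

    def set_limits():
        # Runs in the child just before exec. Reaching the soft limit sends
        # SIGXCPU; the hard limit one second later is a SIGKILL backstop.
        resource.setrlimit(resource.RLIMIT_CPU, (cpu_seconds, cpu_seconds + 1))

    return subprocess.run(argv, preexec_fn=set_limits)


# A deliberately runaway "job" that spins forever.
runaway = [sys.executable, "-c", "while True: pass"]
result = run_with_cpu_limit(runaway, cpu_seconds=1)

# On POSIX a negative return code means the child was terminated by a signal.
killed_by_signal = result.returncode < 0
print("runaway job killed by signal:", killed_by_signal)
```

The same idea extends to memory via `resource.RLIMIT_AS`, though per-process limits like these are a much blunter tool than the cgroup-based limits a container runtime applies.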