by zomglings 1957 days ago
From the article: > I always maintain "Day 0 is easy, Day N is the real test of your skills".

Would be interesting to see how this applies to the author's use of Nomad. It's easy to shit on Kubernetes because of its complexity, but this article seems to be comparing the Nomad Day 0 experience with the Kubernetes Day N experience.

I'm firmly of the opinion that you don't need much more than systemd (or equivalent) + SSH for your home server.


The Nomad Day N experience is pretty good. I help maintain 4 Nomad clusters running approx 40k Allocations (like Pods) for work. We have basically no problems with Nomad itself, and it requires pretty much no day-to-day intervention. Upgrades are pretty painless too. We've gone from 0.7.x to 0.12.x with these same clusters, and will be going to 1.x soon.

Happy to try to answer specific questions.
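
For anyone unfamiliar with Allocations, here is a minimal sketch of how you might tally them by status through Nomad's HTTP API. It is not the tooling described above. The agent address, the helper names, and the optional ACL token are assumptions for illustration; the only Nomad specifics relied on are the `/v1/allocations` endpoint, the `X-Nomad-Token` header, and the `ClientStatus` field on each allocation.

```python
import json
import urllib.request
from collections import Counter

# Assumption: a Nomad agent reachable on the default local HTTP port.
NOMAD_ADDR = "http://127.0.0.1:4646"


def fetch_allocations(addr=NOMAD_ADDR, token=None):
    """Fetch the allocation list from Nomad (GET /v1/allocations)."""
    request = urllib.request.Request(f"{addr}/v1/allocations")
    if token:
        # Only needed when ACLs are enabled on the cluster.
        request.add_header("X-Nomad-Token", token)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def count_by_status(allocations):
    """Count allocations by ClientStatus; entries without one count as 'unknown'."""
    return Counter(a.get("ClientStatus", "unknown") for a in allocations)
```

Calling `count_by_status(fetch_allocations())` against a reachable agent returns a `Counter` keyed by status. On a cluster the size mentioned above the full list would be large, so treat this as a sketch rather than something to poll frequently.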

Do you run other services (Vault, Consul, etc.) for service discovery, configuration management, etc.?

Genuinely curious about the load of managing this on the infrastructure team.

Yep, we run the full stack. Consul for service discovery and as the storage backend for Vault. We use Vault for config, PKI, Nomad/Consul ACL auth, and we're just starting to experiment with MSSQL dynamic credentials.

Of the three systems, Vault probably takes the most of our time and effort, and that's probably only a few hours per month. We've struggled a bit with Vault performance, at least partially because its Consul backend is shared with service discovery.

All of the VMs are built and managed with Terraform using images built with Packer+Ansible. We also use the Nomad/Consul/Vault Terraform providers to apply and manage their configurations.

We have an SRE/Platform Engineering team of 12 (and hiring) that's responsible for the overall orchestration platform, which also includes Prometheus/Thanos/Grafana for metrics and ELK for logs.

Hope that's helpful!