Hacker News
by omneity 1957 days ago
We're using Nomad to power several clusters at Monitoro[0].

We're a super small team and I have to say our experience operating Nomad in production has been extremely pleasant. We're big fans of HashiCorp's engineering and the quality of their official docs.

Our biggest gripe is the lack of a managed Nomad offering, the way GKE is for Kubernetes on Google Cloud. However, once things are set up and running, it requires minimal to no maintenance.

I also run it locally for all sorts of services with a very similar experience.

As another comment mentioned, it's more like a better, distributed version of a scheduler such as systemd. The ecosystem of tools around it is the cherry on top (Terraform, Consul, Fabio, ...).

[0]: https://monitoro.xyz


The selling point of GKE etc. is “minimal to no maintenance,” but of course somebody else is doing the maintenance and the customer is paying a premium for it. Says great things about Nomad.

Yeah, when making the decision it was quite harrowing to think of maintaining a cluster in production. Nomad had very little operational complexity compared to what we imagined.

We've had two main outages in months:

- Server disks were filling up and we hadn't set up monitoring properly at the time (ironic for the name of our company :) ). Not Nomad's fault.

- A faulty healthcheck caused all the servers of a cluster to restart at the same time, which caused complete loss of the cluster state, so all the jobs were gone. I like to call it a collective amnesia of the servers.
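
For anyone wanting to avoid the first outage above, here is a minimal sketch of a disk-usage check, using only the Python standard library. The function names and the 85% threshold are assumptions picked for illustration, not anything from our actual setup; in practice you would wire the result into whatever alerting you already have.

```python
import shutil


def disk_usage_percent(path: str = "/") -> float:
    """Return the percentage of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        # Some pseudo-filesystems report a zero size; treat them as empty.
        return 0.0
    return 100.0 * usage.used / usage.total


def is_disk_filling_up(path: str = "/", threshold_percent: float = 85.0) -> bool:
    """True when usage is at or above the (hypothetical) alert threshold."""
    return disk_usage_percent(path) >= threshold_percent


if __name__ == "__main__":
    path = "/"
    pct = disk_usage_percent(path)
    status = "ALERT" if is_disk_filling_up(path) else "ok"
    print(f"{status}: {path} is {pct:.1f}% full")
```

Run from cron or as a periodic job, a check like this would have flagged the disks well before they filled up.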

We're still looking for a good/reliable logging and tracing solution though. Nomad has a great dashboard, but only with basic logging, and it only gets you so far.

Overall, would recommend again!

Jaeger is pretty great for tracing, and can integrate with Traefik/Envoy (or whatever you use for ingress/inter-service communication).

We're running Loki for the logs (via a Nomad log forwarder/shipper and Promtail) and so far it's going great. I'll have to do a write-up about the whole thing.
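
Not the setup described above, but as a rough illustration of what ends up in Loki: a minimal sketch that builds the JSON body for Loki's HTTP push endpoint (`/loki/api/v1/push`) and sends it with the standard library. The function names, labels, and base URL are hypothetical; a shipper such as Promtail handles batching, retries, and label discovery for you.

```python
import json
import time
import urllib.request


def build_loki_payload(labels, lines, now_ns=None):
    """Build a push body: one stream, each line stamped with the same time.

    Loki expects timestamps as strings of nanoseconds since the Unix epoch.
    """
    ts = str(time.time_ns() if now_ns is None else now_ns)
    return {
        "streams": [
            {"stream": dict(labels), "values": [[ts, line] for line in lines]}
        ]
    }


def push_to_loki(base_url, payload):
    """POST the payload to Loki and return the HTTP status code."""
    req = urllib.request.Request(
        base_url.rstrip("/") + "/loki/api/v1/push",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=5) as resp:
        return resp.status
```

Something like `push_to_loki("http://localhost:3100", build_loki_payload({"job": "example"}, ["hello"]))` would send a single line, assuming a Loki instance is listening at that address.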

Thank you for the pointers, very helpful. I'd love to see that write-up too!

I'd love to see your write-up on the logging thing. Please do!

Would love to see that write-up!