by ochoseis 978 days ago
I’ve started playing with Nomad recently just to see what it’s all about. There are some annoying things with networking, but overall it’s been pretty fun so far. What made you consider it harmful?

The number one issue is that deployments are blocking, and scaling is a deployment. That means during a deployment, scaling is disabled. This makes safely deploying to thousands of containers during scale-in hours extremely dangerous. Combined with nomad-autoscaler being a SPOF and crash-happy, the always-be-deploying startup with a daily traffic pattern should not use Nomad.

It’s also nontrivial to make Nomad resilient. For example, if your template uses keys from Consul or Vault, even in “noop” mode, and Consul or Vault is unreachable, Nomad will happily start killing running containers whose templates had rendered correctly in the past, because it can’t re-render them. This pattern has only recently been addressed by “nomad_retry”, but there have been several bugs with it, and 1.6.2 will currently kill all running containers if some template resource can’t be reached. Under the hood this uses consul-template, which does support infinite retry, but getting Nomad to use consul-template safely is nontrivial. E.g., Vault tokens expire, so infinite retry for Vault doesn’t work as-is.
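
A minimal Python sketch of the "keep the last good render" behavior described above, as opposed to killing a workload when a re-render fails. This is not Nomad's or consul-template's code: `LastGoodRenderer` and `fetch` are hypothetical names, and `fetch` stands in for whatever reads a key from Consul or Vault.

```python
from typing import Callable, Optional


class LastGoodRenderer:
    """Keeps the last successful render when the backend is unreachable."""

    def __init__(self, fetch: Callable[[], str]) -> None:
        # Hypothetical callable that reads a value from Consul or Vault
        # and raises ConnectionError when the backend cannot be reached.
        self._fetch = fetch
        self.last_good: Optional[str] = None

    def render(self) -> Optional[str]:
        try:
            self.last_good = self._fetch()
        except ConnectionError:
            # Backend is down: keep the previous render instead of
            # treating the failure as a reason to stop the workload.
            pass
        return self.last_good
```

Note that this only covers the unreachable-backend case. It does nothing about the expiring Vault token problem, which still needs a renewal path of its own.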

Node_pools just landed (2023!), and are still broken when using Levant (another abandoned Nomad tool: think kustomize with a much more horrifying DSL).

Bonus issues: they’re still trying to get cgroups v2 working; the current version finally doesn’t DoS your backend services in an infinite loop; the UI lies; and deployments can get stuck forever because “progress_deadline” is more of a suggestion than a deadline. nomad-autoscaler is not highly available, crashes very easily, often scales faster than its “cooldown” window, and is simply stupid. By comparison, AWS Karpenter feels a full decade ahead.

And all that so I can write my own Nomad specs instead of installing a vendor’s Helm chart in a network-isolated namespace. Boo.

To reiterate, I’ve liked Hashi for years and years, and I don’t have any ill will towards them. I’m just shocked at how poor Nomad is compared to k8s. It’s definitely more fun than k8s when getting started, for sure.