by lmeyerov 2487 days ago
Fairly similar. Docker/docker-compose take care of launch, healthchecks / soft restarts, replication, GPU virtualization & isolation, log forwarding, status checks, and a bunch of other things. Most of our users end up on-prem, so the result is that _customers_ can admin it relatively easily, not just us, despite weird stuff like use of GPU virtualization. I've had to debug with folks in airgapped rooms over the phone: ops simplicity is awesome.

Some key things beyond your list from an ops view:

-- containers/yml parameterized by version tag (and most things we gen): simplifies a lot

-- packer + ~50 line shell script for airgapped tarball/AMI/VM/etc. generation + tagged git/binary store copies = on-prem + multiple private cloud releases are now at a biweekly cadence, and for our cloud settings, turning on an AMI will autolaunch it

-- low-downtime system upgrades are now basically launching new instances (auto-healthcheck) + running a small data propagation, and upon success, flipping DNS over to them.

-- That same script will next turn into our auto-updater for on-prem / private cloud users without much difference. They generally are single-node, which `docker-compose -p` solves.

-- secrets are a bit wonkier, but essentially docker-compose passes .envs, and dev uses keybase (= encrypted gitfs) and prod is something else
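
To make the version-tag and upgrade bullets concrete, here is a rough Python sketch of that flow: launch a tagged stack, wait on a healthcheck, propagate data, then flip DNS. It is not our actual script. The `VERSION` variable, the project names, and the `launch` / `check` / `propagate` / `flip_dns` callables are all hypothetical placeholders, and it assumes the compose file references images as something like `image: myapp:${VERSION}`.

```python
import time
from typing import Callable, Dict, List, Tuple


def compose_up_command(
    project: str, version_tag: str, compose_file: str = "docker-compose.yml"
) -> Tuple[List[str], Dict[str, str]]:
    """Build the argv and extra environment for launching one tagged stack.

    Assumption: the compose file uses ${VERSION} (a hypothetical variable name),
    which docker-compose substitutes from the environment.
    """
    argv = ["docker-compose", "-p", project, "-f", compose_file, "up", "-d"]
    env = {"VERSION": version_tag}
    return argv, env


def wait_healthy(
    check: Callable[[], bool],
    attempts: int = 30,
    delay_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll a healthcheck until it passes or the attempts run out."""
    for _ in range(attempts):
        if check():
            return True
        sleep(delay_s)
    return False


def upgrade(
    launch: Callable[[], None],
    check: Callable[[], bool],
    propagate: Callable[[], bool],
    flip_dns: Callable[[], None],
    attempts: int = 30,
    delay_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Launch the new stack; flip DNS only if health and data steps succeed."""
    launch()
    if not wait_healthy(check, attempts, delay_s, sleep):
        return False  # old stack keeps serving; DNS is untouched
    if not propagate():
        return False
    flip_dns()
    return True
```

In real use, `launch` would run the argv from `compose_up_command` (e.g. via `subprocess.run(argv, env={**os.environ, **env}, check=True)`), and `flip_dns` would call whatever DNS provider you use. Keeping those as injected callables means the old stack is never touched until the new one has proven itself.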

Some cool things around GPUs happening that I can't talk about for a bit unfortunately, and supporting the dev side is a longer story.

Some of these patterns and the tools involved are a normal part of k8s life... which is my point: going incrementally from docker / docker-compose or equiv lightweight tooling will save your team + business time / money / heartache. Sometimes it's worth blowing months/years/millions and taxing the folks who'd otherwise be uninvolved, but easily for over half the folks out there, probably 90%+, it's not worth it. Instead, as we need a thing, you can see how we incrementally add it from an as-simple-as-possible baseline.