| hmm, I'd like to digress
>- Lots of different kinds of nodes
well, that's not a problem of slurm (which will happily start your process on all nodes), but of typical MPI programming. And once you are running something computationally intensive over multiple nodes today, you are still using MPI.
>- anything more complex dependency wise than a handful of shared Conda envs
you can put whatever dependencies you want on your NFS (or copy them to your node). If you're running on a single node it behaves 100% like running with a special login shell on OS XYZ, so I don't know what problems happen with dependencies. The main problem would be that it doesn't include any "service discovery" beyond OpenMPI.
>- anything involving docker
have not used it, but there's enroot/singularity, the first of which is apparently dogfooded at Nvidia. It would probably need some adjustments to base images (because of MPI)... As I don't know about the policy within these 5k+ cloud companies: can employees just execute any random image from dockerhub there? This seems a little dangerous...
> anything vaguely untrusted
linked to the docker case? Does kubernetes reboot nodes then? Slurm can do this. And while classical Slurm use cases definitely require a shared account (because of the shared fs), slurm should afaik merrily execute your programs even without any shared account other than slurm. You can attack this obviously, but you can attack kubernetes too, and while it gets more scrutiny it's also a byzantine collection of FANG-style requirements.
EDIT: What you can't work around is Slurm needing a comms-channel back to the controller, though you could just firewall that off (jobs don't use Slurm to communicate...). As each job can execute a Prolog-script, you can even quite simply allow traffic to flow only between allocated nodes.
>- any kind of partitioning worse than 3 nines e.g. connectivity or uptime instability
that's indeed the case
>- anything more complex than 3-5 priority levels of scheduling
what kind of scheduling does kubernetes implement? I guess you could write a plugin for slurm doing that
> It's great if you hit that niche but it frankly struggles with the complexities of even moderately heterogeneous work loads.
except that your points didn't pertain to this (except maybe for the dependencies, if you think about actual service-dependencies), I fully agree |
> you can put whatever dependencies you want on your NFS (or copy them to your node).
This is exactly what we do currently. For non-controlled data, this works. However, it gets really thorny when you involve CUI (Controlled Unclassified Information), precisely because of the shared fs you mentioned.
Both SLURM and Kube let you write schedulers, but just getting SLURM to talk to the DB was a tough affair; some very poorly documented bugs were at play.
I haven't been on this project in a bit, so I don't recall the exact details. And maybe it's my lack of familiarity with SLURM. But I definitely felt hobbled by it. We are probably going to move to something based on Hashicorp stuff.