Hacker News new | ask | show | jobs
by cyberpunk 2010 days ago
In 2020 why aren't you packaging your apps as containers? Yeah, it sucks that ibm killed centos, but depending on some single distro's version of libm or libc or whatever is not their fault it's yours. Doing your job properly in this case means shipping you deps with your application, and the easiest way to do that these days is with containers..

Christ; it's either the 90s or kindergarten..

2 comments

Assuming based on GP that this is in a HPC environment, there is often a delineation between the people writing HPC software and the people maintaining the clusters and the software installed on them. Telling a brand-new graduate student with zero software development experience to just throw everything into a container results in running code that is not optimized for the hardware it's running on, which in turn negatively impacts the other users competing for compute time on HPC clusters.

There is a movement to incorporate technologies like Singularity into the HPC workflow but for established projects, it often looks like a lot of bikeshedding for negative results compared to just running the code on bare metal.

Because a cluster doesn't work like a normal computer.

Your users don't see the nodes. They submit jobs and wait for their turn in the cluster. A sophisticated resource planner / job scheduler tries to empty the queue while optimizing job placement so the system usage can be maximized as much as possible.

Also, users' jobs work in under their own users. You need to isolate them. Giving them access to docker or any root level container engine is completely removing UNIX user security and isolation model and running in Windows95 mode. This also compromises system security since everyone is practically root at that point. Singularity is user-mode and its usage is increasing but then comes the next point.

Performance and hardware access is critical in HPC. GPU and special HBAs like Infiniband requires direct access from processes to run at their maximum performance or work at all. GPU access is much more important than containerizing workloads. Docker GPU is here because nVidia wanted to containerize AI workloads on DGX/HGX systems. These technologies are maturing on HPC now.

In performance front, consider the following: If main loop of your computation loses a second due to these abstractions, considering this loops run thousand times per core on many nodes, lost productivity is eye-watering. My simple application computes 1.7 million integrations per second per core. So, for working on long problems, increasing this number is critical.

Last but not the least, some of the applications run on these systems are developed for 20 years now. So, these applications are not some simple code bases which are extremely tidy and neat. You can't know/guess how these applications behave before running them inside a container. As I've said, you need to be able to trust what you have too. So, we scientists and HPC administrators tend to walk slowly but surely.

Doing my job properly on the HPC side means my cluster works with utmost efficiency and bulletproof user isolation so people can trust the validity of their results and integrity of their privacy. Doing my job properly on the development side means that my code builds with minimum effort and with maximum performance on systems I support. HPC software is not a single service which works like a normal container workload. We need to evolve our software to run with minimum problems with containers and containers should evolve to accommodate our workloads, workflows and meet our other needs.

The cutting edge technology doesn't solve every problem with same elegance. Also we're not a set of lazy academics or sysadmins just because our systems work more traditionally.