Hacker News
by ktpsns 3140 days ago
High-energy physicist here, with a regional HPC center on the same floor. My observation is that administrators tend to prefer enterprise distributions such as Scientific Linux and SUSE Linux Enterprise Server (SLES), together with commercial MPI implementations such as IBM MPI and Intel MPI.

On the other hand, people are used to Linux: in my environment literally everybody has Ubuntu on their notebook and workstation. They know how to run their Python analysis scripts there, and the only thing they have to change when moving to the cluster is to adopt an environment management system (such as http://modules.sourceforge.net/).
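
For what it's worth, a script can check which modules are loaded before it starts. This is only a minimal sketch, assuming the module system exports the colon-separated LOADEDMODULES variable (Environment Modules does); the function names and the module names and versions are hypothetical.

```python
import os


def loaded_modules(environ=None):
    """Return the names of loaded modules listed in LOADEDMODULES.

    Environment Modules keeps a colon-separated list of loaded
    modulefiles in this variable; it is empty or unset when nothing
    is loaded (for example on a laptop without a module system).
    """
    env = os.environ if environ is None else environ
    raw = env.get("LOADEDMODULES", "")
    return [name for name in raw.split(":") if name]


def missing_modules(required, environ=None):
    """Return the required modules that are not currently loaded.

    A requirement without a version ("gcc") matches any loaded
    version ("gcc/7.2.0"); a versioned one must match exactly.
    """
    loaded = loaded_modules(environ)
    missing = []
    for req in required:
        ok = any(m == req or m.startswith(req + "/") for m in loaded)
        if not ok:
            missing.append(req)
    return missing


if __name__ == "__main__":
    # Hypothetical requirements for an analysis script.
    print("missing:", missing_modules(["gcc", "python/3.6"]))
```

On a workstation without a module system the variable is simply unset, so every requirement is reported as missing rather than raising an error.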

(However, I have to admit I have never worked with BSD and don't know the differences in user space.)

2 comments

One thing very much worth following as a replacement or supplement to modules is Singularity. It is relatively easy to create images that contain all of your required libraries, and you can run these same containers on both your cluster and on your development system.

This can substantially reduce the time to deploy new software and cut down on overhead related to managing multiple modules.

Singularity, unlike Docker, is designed to require only minimal privilege escalation, which makes it an easy sell to HPC admins. They can (at least somewhat) get out of the business of helping users figure out what the heck is weird about their environment when trying to get something running on a cluster for the first time. You can also take these containers with you and be reasonably certain they'll work on another system.
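
As a rough sketch of that workflow driven from Python: the same `singularity exec <image> <command>` invocation is used on the workstation and on the cluster. The helper names, the image name, and the script name are hypothetical. The code mostly just builds the command line, and it only runs anything if a `singularity` binary is found on PATH.

```python
import shutil
import subprocess


def singularity_exec_argv(image, command):
    """Build the argv for `singularity exec <image> <command...>`."""
    return ["singularity", "exec", image] + list(command)


def run_in_container(image, command):
    """Run the command inside the image and return its exit code.

    Returns None when no singularity binary is on PATH, so the
    sketch stays harmless on machines without it.
    """
    if shutil.which("singularity") is None:
        return None
    return subprocess.run(singularity_exec_argv(image, command)).returncode


if __name__ == "__main__":
    # Hypothetical image and script names.
    argv = singularity_exec_argv("analysis.img", ["python", "analysis.py"])
    print(" ".join(argv))
```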

http://singularity.lbl.gov/

Modules looks really interesting. Makes me wonder why Continuum is out there trying to reinvent the wheel with Anaconda. Glad to have something I can use at work to replace Conda environments. Now all it needs is PowerShell/CMD support so I don't have to use it inside Cygwin...

Modules are really about environments (including software management). Anaconda doesn't handle this. For example, Conda ships its own version of HDF5 and points to its own environment path. Let's say you want to use a different version of HDF5. An easy way to do this is to load a module for that version. You are giving users an easy way to set up their environment, where they really don't have to know anything about it.

It also helps with versioning. It is not uncommon to see several versions of the GCC and Intel compilers on one system. In essence, the user should be able to set up their environment with a few module loads.
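
To make the HDF5 example concrete, here is a toy model in Python of what loading a module boils down to: prepending version-specific directories to variables such as PATH and LD_LIBRARY_PATH. This is not how Environment Modules is implemented (real modulefiles are written in Tcl), and the install prefixes, versions, and function names are made up for illustration.

```python
# Hypothetical install prefixes; a real site defines these in modulefiles.
MODULES = {
    "hdf5/1.8.19": "/opt/hdf5/1.8.19",
    "hdf5/1.10.1": "/opt/hdf5/1.10.1",
}


def prepend_path(env, var, directory):
    """Put directory at the front of a colon-separated variable."""
    parts = [p for p in env.get(var, "").split(":") if p and p != directory]
    env[var] = ":".join([directory] + parts)


def remove_path(env, var, directory):
    """Drop directory from a colon-separated variable."""
    parts = [p for p in env.get(var, "").split(":") if p and p != directory]
    env[var] = ":".join(parts)


def load(env, name):
    """Return a copy of env with the module's directories in front."""
    new = dict(env)
    prefix = MODULES[name]
    prepend_path(new, "PATH", prefix + "/bin")
    prepend_path(new, "LD_LIBRARY_PATH", prefix + "/lib")
    return new


def unload(env, name):
    """Return a copy of env with the module's directories removed."""
    new = dict(env)
    prefix = MODULES[name]
    remove_path(new, "PATH", prefix + "/bin")
    remove_path(new, "LD_LIBRARY_PATH", prefix + "/lib")
    return new


if __name__ == "__main__":
    env = load({"PATH": "/usr/bin"}, "hdf5/1.8.19")
    # Switching versions: unload the old one, then load the new one.
    env = load(unload(env, "hdf5/1.8.19"), "hdf5/1.10.1")
    print(env["PATH"])
```

The point of the sketch is that the user only names a package and version; which directories end up in which variables is the admin's problem.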

Here's some more info, if you're interested:

[1]http://www.admin-magazine.com/HPC/Articles/Managing-the-Buil...

[2]http://www.admin-magazine.com/HPC/Articles/Managing-Cluster-...

[3]https://uisapp2.iu.edu/confluence-prd/pages/viewpage.action?...

These solve completely different problems. On my HPC cluster I load the conda module to run Python.

They are really different tools for different jobs.