by physicsguy 644 days ago
Documentation quality is mixed, but the workflow is usually similar between clusters.

You typically write a bash script with some metadata lines at the top that say how many nodes you want, how many cores on each of those nodes, and what accelerator hardware you need, if any.
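As a rough sketch of what such a script can look like, here is one assuming the cluster runs Slurm (the scheduler is my assumption; PBS and others use different directive prefixes). The job name, resource numbers, and ./my_program are hypothetical placeholders.

```shell
#!/bin/bash
# Metadata lines: to bash these are comments, but Slurm's sbatch reads them.
#SBATCH --job-name=example        # hypothetical job name
#SBATCH --nodes=2                 # how many nodes
#SBATCH --ntasks-per-node=16      # how many tasks (cores) on each node
#SBATCH --gres=gpu:1              # accelerator request, if any
#SBATCH --time=00:30:00           # wall-clock limit

# Slurm sets these variables inside a job; the defaults allow a dry run elsewhere.
nodes="${SLURM_JOB_NUM_NODES:-1}"
tasks_per_node="${SLURM_NTASKS_PER_NODE:-1}"
total_tasks=$(( nodes * tasks_per_node ))
echo "Running on ${nodes} node(s) with ${total_tasks} task(s) in total"

# Only launch the program when actually running inside a Slurm job.
if [ -n "${SLURM_JOB_ID:-}" ]; then
    srun ./my_program             # hypothetical executable
fi
```

On a Slurm cluster you would submit this with sbatch; run directly with bash, it just prints the fallback values.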

Then it’s typically just a matter of setting up the environment to run your software. On most supercomputers you need to use environment modules (‘module load gcc@10.4’) to load compilers, parallelism libraries, and other software. You can sometimes set this up on the login node to try things out and make sure they work, but you’ll generally get an angry email if you run processes for more than 10 minutes, because login nodes are a shared resource.
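A minimal sketch of that environment setup, assuming the cluster provides the usual 'module' shell command. The module name is the one from the example above; real names and versions differ per cluster, so check 'module avail' first.

```shell
#!/bin/bash
# Module name taken from the example above; it will differ on your cluster.
wanted_module="gcc@10.4"

if command -v module >/dev/null 2>&1; then
    module purge                    # start from a clean environment
    module load "$wanted_module"    # make the compiler available
    module list                     # show what is actually loaded
    env_status="modules"
else
    # Not on a cluster (or modules are not initialised in this shell).
    echo "No 'module' command here; skipping module setup" >&2
    env_status="none"
fi

# Quick sanity check: which compiler, if any, is now visible?
if command -v gcc >/dev/null 2>&1; then
    gcc --version | head -n 1
fi
```

The same lines usually go near the top of the job script, so the compute nodes see the same environment you tested on the login node.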

There’s a tension here because it’s often difficult to get this right. People often want to just run ‘pip install <package>’, but that can leave a lot of performance on the table, because pre-compiled software usually targets lowest-common-denominator systems rather than high-end ones. On the other hand, cluster admins can’t precompile and install every Python package ever. EasyBuild and Spack are package managers that aim to make this easier.
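To make the "lowest common denominator" point concrete, here is a small sketch that reports the widest SIMD extension a node advertises. It assumes Linux on x86 (it reads /proc/cpuinfo), and the list of extensions checked is my own choice, not anything cluster-specific.

```shell
#!/bin/bash
# Find the widest SIMD extension listed in /proc/cpuinfo (Linux, x86 only).
simd="baseline"
if [ -r /proc/cpuinfo ]; then
    # Checked from widest to narrowest; stop at the first match.
    for ext in avx512f avx2 avx sse4_2; do
        if grep -qw "$ext" /proc/cpuinfo; then
            simd="$ext"
            break
        fi
    done
fi
echo "Widest SIMD extension found: ${simd}"

# A generic pre-built binary cannot assume the newest extensions exist.
# Building from source on the cluster is one way to use them, for example
# (left as comments because they install software):
#   pip install --no-binary <package> <package>
#   spack install <package>
```

Whether a source build actually uses those instructions depends on the package and its compiler flags, so treat this as a diagnostic rather than a guarantee.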

Source: I worked in HPC in physics and then at a university cluster, supporting users doing exactly this sort of thing.