by the_svd_doctor 953 days ago
In HPC it's common to do a mix of MPI (message-passing / distributed memory) and OpenMP (shared memory) parallelism when running on big multicore (and obviously multi-node) machines. It helps with locality, among other things.

This is actually what I do, and it works fairly well. Currently I run one MPI process per socket, mostly because the OpenMP code I’m calling is a library, and it doesn’t seem to scale well past one modern Xeon’s worth of cores.
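
For illustration, here is a minimal sketch of the arithmetic behind a one-rank-per-socket layout. The names (`Layout`, `one_rank_per_socket`, `cores_for_rank`) and the machine shape in the usage note are hypothetical, and the core numbering is assumed to be contiguous within each socket, which is not true on every system and should be checked against the real topology.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layout:
    ranks: int             # total number of MPI processes in the job
    threads_per_rank: int  # OpenMP threads per MPI process (OMP_NUM_THREADS)


def one_rank_per_socket(nodes: int, sockets_per_node: int,
                        cores_per_socket: int) -> Layout:
    """Hypothetical helper: one MPI rank per socket, one thread per core."""
    if min(nodes, sockets_per_node, cores_per_socket) < 1:
        raise ValueError("all counts must be positive")
    return Layout(ranks=nodes * sockets_per_node,
                  threads_per_rank=cores_per_socket)


def cores_for_rank(local_rank: int, cores_per_socket: int) -> range:
    """Core IDs a node-local rank would be pinned to.

    Assumption: cores are numbered contiguously within each socket.
    """
    if local_rank < 0 or cores_per_socket < 1:
        raise ValueError("local_rank must be >= 0 and cores_per_socket >= 1")
    start = local_rank * cores_per_socket
    return range(start, start + cores_per_socket)
```

For a made-up cluster of 4 nodes with 2 sockets of 16 cores each, `one_rank_per_socket(4, 2, 16)` gives 8 ranks with 16 threads each, and the second rank on a node would cover cores 16 through 31 under the contiguous-numbering assumption. The same arithmetic would apply to a finer-grained unit than a socket by substituting that unit's counts.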

I don’t know what I’d do if I had an old Zen machine; maybe map one MPI process to each chiplet.

My impression is that in the first-generation Zen machines, the cost of communicating from one chiplet to another was quite significant, but they’ve made enough progress since then that it is now something only the really hardcore folks care about.