| HN Mirror

In one of these simulations, the logic is something like "calculate all the new positions and velocities of the molecules in a box, updated by one micro time step." It's implemented by updating a bunch of matrices in-place by mutation, often using linear algebra subroutines provided by a high performance BLAS like OpenBLAS or the one from Intel's MKL. The matrices are large enough that you'll have to page to disk if you return a new matrix from an old matrix instead of updating the matrix in-place. That's why requiring more memory makes it slower.

At the hardware level, transforming a numerical array M1 into M2 by mutation in-place requires a certain number of memory reads and writes. Leaving the old matrix M1 untouched and returning M2 as a new value (in the immutable style) means that you require at least as many memory accesses as before, but cache locality is likely worse because you're not updating the same address region you are reading from. So I'd expect numerical simulations based on immutable programming techniques to run slower than current simulation tools (like NAMD, AMBER, GROMACS) that perform mutation in-place.