| [Disclaimer: How do I know this stuff?
I'm a core developer of GROMACS, one of the major molecular dynamics simulation FOSS community codes. Molecular dynamics is the algorithm/method behind the F@H simulations.
I work as a researcher in high performance computing and have (co-)developed many of the parallel algorithms and GPU acceleration in GROMACS.]
20-200x is simply not true. Typically, such numbers are a result of comparing unoptimized CPU code to moderately or well-optimized GPU code, which is often misleading. (Such differences are, however, perfectly reasonable when comparing hardware-accelerated workloads like ML/DL.) If you compare actually well-optimized codes, you'll see more like a ~4-5x difference in performance for FLOP/instruction-bound code, as is the case for well-optimized molecular dynamics.
Case in point: I recently pointed out the huge difference in CPU performance between two of the top molecular simulation codes, one of which is 8-10x faster on CPUs than the other while solving the same problem [2].
F@H relies on GROMACS as a CPU engine [1], which happens to be the code I referred to above as the fast one. The trouble is that F@H has not updated their CPU engine for many years and distributes CPU binaries that lack the crucial SIMD optimizations needed to make use of AVX2/AVX512 on modern x86 CPUs, as well as the years of algorithmic improvements and code optimizations we have made. These two factors combined lead to _significantly_ lower F@H CPU performance compared to what they would have were they using a recent GROMACS engine.
Consequently, due to the combination of an inherent performance advantage of GPUs and the severely outdated CPU engine, it is indeed not worth wasting energy on running F@H on CPUs.
[1] https://en.wikipedia.org/wiki/List_of_Folding@home_cores#GRO...
[2] https://twitter.com/twilard/status/1235142089156984832?s=20
Edit 1: adjusted wording to reflect that the performance difference from running an outdated GROMACS version with suboptimal SIMD optimizations on modern hardware can vary, depending on hardware and inputs.
Edit 2: fixed typo + formatting. |