by slizard 2285 days ago
[Disclaimer: How do I know this stuff? I'm a core developer of GROMACS, one of the major molecular dynamics simulation FOSS community codes. Molecular dynamics is the algorithm/method behind the F@H simulations. I work as a researcher in high performance computing and have (co-)developed many of the parallel algorithms and GPU acceleration in GROMACS.]

20-200x is simply not true. Typically, such numbers result from comparing unoptimized CPU code to moderately or well-optimized GPU code, which is often misleading. (Such differences are, however, perfectly reasonable when comparing hardware-accelerated workloads like ML/DL.) If you compare actually well-optimized codes, you'll see more like a ~4-5x difference in performance for FLOP/instruction-bound code, as is the case for well-optimized molecular dynamics.

Case in point, I recently pointed out the huge difference in CPU performance of two of the top molecular simulation codes, one of which is 8-10x faster on CPUs than the other, solving the same problem [2].
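
To make the arithmetic behind inflated speedup claims concrete, here is a minimal sketch. It assumes, purely for illustration, that the two ratios quoted above simply multiply: a ~4-5x gap between well-optimized GPU and CPU code, and an 8-10x gap between a fast and a slow CPU code. The function name `apparent_speedup` is hypothetical, and real codes will not compose this cleanly.

```python
def apparent_speedup(gpu_over_optimized_cpu: float, cpu_slowdown: float) -> float:
    """Speedup a GPU code appears to have over a poorly optimized CPU baseline.

    gpu_over_optimized_cpu: GPU speedup over well-optimized CPU code.
    cpu_slowdown: how many times slower the CPU baseline is than
                  well-optimized CPU code (1.0 means a fair baseline).
    Assumption for illustration: the two factors multiply.
    """
    if gpu_over_optimized_cpu <= 0 or cpu_slowdown <= 0:
        raise ValueError("both ratios must be positive")
    return gpu_over_optimized_cpu * cpu_slowdown


if __name__ == "__main__":
    # Ratios taken from the comment: ~4-5x (GPU vs CPU), 8-10x (CPU vs CPU).
    low = apparent_speedup(4.0, 8.0)
    high = apparent_speedup(5.0, 10.0)
    print(f"apparent GPU speedup vs slow CPU baseline: {low:.0f}x to {high:.0f}x")
```

Under these assumptions, measuring against the slower CPU code alone turns a 4-5x gap into an apparent 32-50x one, which is the kind of inflation described above.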

F@H relies on GROMACS as a CPU engine [1], which happens to be the same code I cited above as the fast one. The trouble is that F@H has not updated its CPU engine for many years and distributes CPU binaries that lack the crucial SIMD optimizations needed to make use of AVX2/AVX512 on modern x86 CPUs, as well as the years of algorithmic improvements and code optimization we have made. These two factors combined lead to _significantly_ lower F@H CPU performance compared to what they would get were they using a recent GROMACS engine.
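
If you want to check whether your own CPU supports the instruction sets mentioned above, here is a minimal sketch. It assumes an x86 Linux machine, where the kernel lists CPU feature flags in `/proc/cpuinfo`; the function names are hypothetical, and this says nothing about which instruction set a given F@H or GROMACS binary was actually built for.

```python
from pathlib import Path
from typing import List

# Feature flags as they are spelled in /proc/cpuinfo on x86 Linux.
SIMD_FLAGS = ("sse4_1", "avx", "avx2", "avx512f")


def simd_flags_from_cpuinfo(cpuinfo_text: str) -> List[str]:
    """Return the SIMD flags of interest found in the first 'flags' line."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags") and ":" in line:
            present = set(line.split(":", 1)[1].split())
            return [flag for flag in SIMD_FLAGS if flag in present]
    return []


def detect_simd(path: str = "/proc/cpuinfo") -> List[str]:
    """Read the flags from the local machine; empty list if unavailable."""
    cpuinfo = Path(path)
    if not cpuinfo.exists():  # e.g. not running on Linux
        return []
    return simd_flags_from_cpuinfo(cpuinfo.read_text())


if __name__ == "__main__":
    print("SIMD support reported by the kernel:", detect_simd())
```

A CPU that reports `avx2` or `avx512f` only benefits from them if the binary it runs was compiled to use them, which is the point made above about the distributed CPU binaries.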

Consequently, due to the combination of an inherent performance advantage of GPUs and the severely outdated CPU engine, it is indeed not worth wasting energy on running F@H on CPUs.

[1] https://en.wikipedia.org/wiki/List_of_Folding@home_cores#GRO...

[2] https://twitter.com/twilard/status/1235142089156984832?s=20

Edit 1: adjusted wording to reflect that running an outdated GROMACS version with suboptimal SIMD optimizations on modern hardware can result in a range of performance differences, depending on hardware and inputs. Edit 2: fixed typo + formatting.


It seems irresponsible to waste donated cycles like that. Does anyone else do it better?

> Does anyone else do it better?

To that question, assuming that by "anyone" you mean other donate-your-cycles distributed computing projects: I am not too familiar with how well-optimized the codes of different @home projects are.

Taking a few steps back, perhaps the efficiency of these codes is the lesser issue and, to be honest, in some (many?) cases other forms of donation/contribution may further scientific progress more than simply crunching numbers on one's home PC.

Yes, I'm referring to @home projects.

Totally, but if the work done @home is useful, donating compute time makes economic sense, I think.

If I'm willing to donate $10, I can donate money, which may be used to buy $10 worth of compute, and that has to cover all costs including the hardware and administration.

Or I can donate $10 worth of pure electricity and cover the other marginal costs at no or very small extra cost, since I already own the hardware for other purposes and it is temporarily not in use.

In the latter case the value of my $10 is higher, I theorize. Again, given that the @home project is truly useful.
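
The reasoning above can be sketched as a small calculation. All numbers are hypothetical placeholders chosen for round arithmetic, not measured costs, and the sketch assumes home hardware is as efficient per hour as the purchased compute, which may well not hold. The function name `compute_hours` is likewise just for illustration.

```python
def compute_hours(budget_usd: float,
                  electricity_usd_per_hour: float,
                  overhead_usd_per_hour: float = 0.0) -> float:
    """Hours of compute a budget buys at a given hourly cost.

    overhead_usd_per_hour stands for hardware amortization and
    administration; it is zero when already-owned, idle hardware is used.
    """
    hourly_cost = electricity_usd_per_hour + overhead_usd_per_hour
    if hourly_cost <= 0:
        raise ValueError("hourly cost must be positive")
    return budget_usd / hourly_cost


if __name__ == "__main__":
    budget = 10.0        # the $10 from the comment
    electricity = 0.25   # hypothetical USD per hour of compute
    overhead = 0.75      # hypothetical hardware + administration, USD per hour

    cash_donation = compute_hours(budget, electricity, overhead)
    home_donation = compute_hours(budget, electricity)
    print(f"cash donation buys  {cash_donation:.0f} h of compute")
    print(f"home electricity buys {home_donation:.0f} h of compute")
```

With these placeholder values the same $10 yields four times as many compute hours when spent as electricity on idle home hardware. The ratio depends entirely on how large the overhead share really is.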

Unfortunately, there is typically no way to directly donate funds (especially not $10, even if there are 10000 people who'd do that on a monthly basis) to some research (group) of one's choosing. Therefore the choice is whether to donate to an @home project and trust that what they do is meaningful and that the donated resources are used in a responsible manner.

In that respect, the responsibility for whether to ask for donations and how to make good use of them lies solely with the teams that receive them. Without oversight it would, however, be foolish of them to be overly critical of their own shortcomings, as there is a great benefit to having these cheap FLOPS (and good PR) that F@H brings.

I was about to suggest that it would be great to set up a merit-based funding scheme somewhat akin to the governmental funding agencies, but one run independently by the "council of the people". I'm, however, uncertain how effective such an organization could be at awarding the funding in a responsible and effective manner.

Can you please elaborate? Are you referring to the F@H CPU "cores" being outdated and inefficient?

Yes, if it's so. When donating money or compute time, I want my donation to be used efficiently and not be wasted by high overheads.

> When donating money or compute time, I want my donation to be used efficiently and not be wasted by high overheads.

Fair point. I think this is something you should bring up with the authors of Folding@Home. I do not work on that project.

My personal view on this is that there is always a cost/benefit balance that one has to strike, which is often tricky, especially given the considerably constrained resources, as is typically the case in academic computational/simulation tool development.

It is, however, as you point out, a great responsibility of the researchers and developers of codes to make sure that the choices made and actions taken (or not) do not lead to a disproportionate waste of resources donated by volunteers or awarded through grants by research funding agencies.

I do not know the detailed reasons why F@H chose not to update the CPU "core" since FahCore_a7 (AFAIK based on GROMACS code from 2014), but it is likely related to the aforementioned cost/benefit analysis done in their team. One of the motivations could have been (just hypothesizing) that the software engineering effort estimated to be required to update FahCore for CPUs (which generate a very small fraction of the "points") would have taken away resources from the GPU FahCore.