by danpalmer 3440 days ago
I never got into high-performance scientific computing, but I believe the stuff that was done in my department at university was all MPI-based and required very high interconnect speeds (like with InfiniBand). It looks like your offering is much more standard. What's the thinking there, or am I just wrong/out of date?

It depends heavily on the kind of work. If you have a large-scale simulation that needs to be partitioned, like a weather system, you are I/O bound and need the fattest interconnects you can get. However, some problems are computationally very hard but not very large. Basically everything in NP and EXP is a good candidate. There you can distribute the same problem to a bazillion systems, each with a different starting configuration, and let them run until one of them finds a solution.
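To make the "same problem, different starting configuration" idea concrete, here is a rough sketch, not anyone's production setup. It uses 8-queens as a stand-in for a hard-but-small problem and a thread pool as a stand-in for a pile of independent machines; both choices, and all the names (`conflicts`, `search_from_seed`, `run_restarts`), are just assumptions for illustration. In CPython threads won't actually speed up CPU-bound work, so on real hardware you would use processes or separate hosts. The point is only that the workers never talk to each other.

```python
import random
from concurrent.futures import ThreadPoolExecutor, as_completed

N = 8  # board size, chosen only for illustration


def conflicts(cols):
    """Count attacking queen pairs; cols[r] is the column of the queen in row r."""
    n = len(cols)
    return sum(
        1
        for a in range(n)
        for b in range(a + 1, n)
        if cols[a] == cols[b] or abs(cols[a] - cols[b]) == b - a
    )


def search_from_seed(seed, n=N, max_steps=2000):
    """One independent local search from a seed-dependent start; returns a solution or None."""
    rng = random.Random(seed)
    cols = [rng.randrange(n) for _ in range(n)]
    for _ in range(max_steps):
        if conflicts(cols) == 0:
            return cols
        row = rng.randrange(n)
        # Move this row's queen to the column with the fewest conflicts (ties broken randomly).
        cols[row] = min(
            range(n),
            key=lambda c: (conflicts(cols[:row] + [c] + cols[row + 1:]), rng.random()),
        )
    return cols if conflicts(cols) == 0 else None


def run_restarts(seeds, workers=4):
    """Launch one search per seed and return the first solution any of them reports."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(search_from_seed, s) for s in seeds]
        for fut in as_completed(futures):
            result = fut.result()
            if result is not None:
                for other in futures:
                    other.cancel()  # drop searches that have not started yet
                return result
    return None


if __name__ == "__main__":
    print(run_restarts(range(16)))
```

The only data that crosses the "network" here is a seed going out and a short answer coming back, which is why this kind of workload doesn't care about interconnect speed.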

If you look at the BOINC projects, they are basically all problems of this kind. Folding proteins, as Folding@home does, is one example. The description of a protein is fairly small, a couple of megabytes at most. However, it takes a long time to simulate its behaviour, since chemistry is a messy probabilistic process with lots of back and forth. Nature does this on trillions of proteins at the same time within nanoseconds, and while we cannot reasonably increase the simulation speed of an individual protein, we can at least simulate as many proteins at once as possible.

An important secret in HPC is that MPI is rarely required to achieve your objectives. In many ways, vendors just use MPI as a way to sell expensive systems. If you can find any way to make your system scale using threads on a single machine, or using non-latency-sensitive networking, do so.

If you don't need a high-speed interconnect, you don't need HPC. That's not to say that MPI per se must always be involved, but if, for instance, the 10 Gbit connection on Amazon's half-baked "HPC" offering is sufficient, then you definitely don't need a supercomputer.

There is a ton of important scientific work waiting for core hours that really shouldn't be. A loosely connected grid of laptops would serve a lot of projects very well. On the other hand, there is a large body of work that does require a classical supercomputer, so it doesn't really do anyone any good to accuse MPI of being a sales gimmick.

There is plenty of HPC that does not need an interconnect. It's false, categorically, to say that HPC requires an interconnect of any kind.

An isolated, off-net computer, even a desktop PC, stuffed to the gills with GPUs can do HPC. On the other hand, machines connected with 10 Gbit might do HPC, but you'll have trouble getting codes to scale in a way that is "high performance" relative to what you can get out of threading on a single machine or a small number of GPUs.

Very little work truly requires classic supercomputers or MPI. There are very few codes where an important engineering problem must be run on a system with a low-latency, high-bandwidth interconnect.

Or rent a bigger AWS/EC2 instance to prepare for the eventual demise of old-school HPC.