Hacker News
by btrettel 2352 days ago
I disagree. I did my master's in computational fluid dynamics (CFD) and I'd say that a large fraction of supercomputer use (in fluid dynamics at least) is wasted, mostly because people take naive approaches and end up computing the wrong thing, set up their simulation poorly, reinvent the wheel, use HPC on something that can be computed by hand, etc. If they read more of the literature they'd have a more solid grasp on things and use the software much more efficiently when they do use it.

My philosophy at the moment is to use HPC only when I've exhausted other possibilities. I think many people jump to HPC prematurely. The simpler approaches are so much cheaper that I think trying them first is usually worthwhile. I'm skeptical of the argument that it's cheaper to use HPC than it is to use more efficient methods in this case, because the more efficient methods are often something like a few days spent reading to find the right equation or existing experimental data vs. at least that much time setting up a simulation and longer running it.
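To make "computed by hand" concrete, here is a minimal sketch of the kind of closed-form estimate I mean: the pressure drop for fully developed laminar flow in a straight circular pipe, using the Hagen-Poiseuille equation. The fluid properties and pipe dimensions are made-up illustrative numbers (roughly water in a small tube), and the function names are my own, not from any particular library.

```python
import math

def reynolds_number(density, velocity, diameter, viscosity):
    """Re = rho * V * D / mu for pipe flow."""
    return density * velocity * diameter / viscosity

def hagen_poiseuille_pressure_drop(viscosity, length, flow_rate, diameter):
    """Pressure drop (Pa) for fully developed laminar flow in a circular pipe."""
    return 128.0 * viscosity * length * flow_rate / (math.pi * diameter**4)

# Hypothetical inputs (assumptions for illustration only), SI units.
density = 998.0      # kg/m^3, roughly water at room temperature
viscosity = 1.0e-3   # Pa*s
diameter = 0.01      # m
length = 2.0         # m
velocity = 0.1       # m/s, mean velocity

flow_rate = velocity * math.pi * diameter**2 / 4.0  # m^3/s

re = reynolds_number(density, velocity, diameter, viscosity)
dp = hagen_poiseuille_pressure_drop(viscosity, length, flow_rate, diameter)

# The formula only applies to laminar flow (Re below roughly 2300).
print(f"Re = {re:.0f}, pressure drop = {dp:.1f} Pa")
```

If the Reynolds number came out well above the laminar range, this equation would not apply and you would go looking for a friction-factor correlation or experimental data instead, which is still far cheaper than a simulation.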

Edit: Bill Rider has a bunch of blog posts that make similar points:

https://wjrider.wordpress.com/2016/06/27/we-have-already-los...

https://wjrider.wordpress.com/2015/12/25/the-unfortunate-myt...

https://wjrider.wordpress.com/2016/05/04/hpc-is-just-a-tool-...

https://wjrider.wordpress.com/2016/11/17/a-single-massive-ca...

https://wjrider.wordpress.com/2014/02/28/why-algorithms-and-...

3 comments

Many moons ago, when Beowulf clusters were still new, I remember a project where I was given a month's worth of then-new cluster time to spend. Due to a delay we had to wait a few months before we could use it. One weekend I was playing around with some junk computers I'd assembled for a LAN party. Long story short, I tried out a portion of the project on that LAN and got a useful result, so we ran the rest of the project on that setup and joined more machines to it. We completed the project using no special machines, often just idle machines added and removed as they became available. Along the way we rewrote the entire project code as a set of modules, then used Python to orchestrate them.
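For anyone wondering what "modules orchestrated by Python" might look like, here is a rough sketch, not our actual code. Threads in one process stand in for the LAN machines, `run_module` is a hypothetical placeholder for a real project module, and the per-worker stop events are my assumption about how you might model a machine being withdrawn.

```python
import queue
import threading

def run_module(task):
    """Hypothetical stand-in for one project module; here it just squares a number."""
    return task * task

def worker(tasks, results, stop_event):
    """Pull tasks until the queue is empty or this 'machine' is withdrawn."""
    while not stop_event.is_set():
        try:
            task = tasks.get_nowait()
        except queue.Empty:
            return
        results.put((task, run_module(task)))

def orchestrate(task_list, n_workers):
    """Run every task on a pool of workers and collect the results in a dict."""
    tasks = queue.Queue()
    results = queue.Queue()
    for task in task_list:
        tasks.put(task)

    # One stop event per worker: setting it would withdraw that worker,
    # leaving any remaining tasks in the queue for the others.
    stop_events = [threading.Event() for _ in range(n_workers)]
    threads = [
        threading.Thread(target=worker, args=(tasks, results, event))
        for event in stop_events
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()

    collected = {}
    while not results.empty():
        task, value = results.get()
        collected[task] = value
    return collected

results = orchestrate(range(10), n_workers=3)
print(results[3])
```

The point of the shared queue is that no task is tied to a particular machine, so it does not matter much which workers happen to be around at any given moment.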

None of this is extraordinary now, but the result was that we cut the budget requirements and processing times several times over using code improvements alone. A lot of the time, the head-on solution just needed optimisation. Sideways improvements, such as small optimisations, also helped. The more exotic equipment is still useful, but it's an accelerator.

One last thing: we received a LOT of grief and criticism for our approach. There was peer pressure to use particular solution types even though they were wildly inappropriate. We had funding pulled, or threatened to be pulled, by various backers. One lesson we learnt: don't underestimate how vested certain interests are in the use of various toolkits. "Use this or else!" is the not-so-subtle threat.

I'm so glad to not rely on only academic work now.

This is _so_ true! I remember a few years ago interviewing for a position where the largest supercomputer in the area was being used to run Perl programs. Of course the place I ended up working was running Python at supercomputer scale and running out of I/O bandwidth because of all the naive Python runtime startup crud.

This may be true for some fields, but it's certainly not true for all fields. I'd be surprised if there are still orders-of-magnitude improvements to be discovered for BLAST, for example, and that's a foundational component of modern biology.

I should have spoken more generally to OP's point though. What I was really hoping to get at is that there are applications other than games that require non-trivial amounts of compute, and speeding them up would make meaningful differences in their users' lives.