by bayindirh 65 days ago
Actually, C, FORTRAN, and C++ are all friendly to memory bandwidth when written correctly.

C++ is better than FORTRAN because, while FORTRAN is still being developed and is quite fast, doing anything other than what core FORTRAN is good at is hard. At the end of the day, it computes and works well with MPI. That's mostly all.

C++ is better than C because it can accommodate C code, has many more convenience functions and libraries, and modern C++ can be written more concisely than C with minimal or no added overhead.

Also, all three languages are so well studied that advanced programmers can look at a piece of code and say, "I can fit that into the cache, that'll work, that's fine".
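To make the "fit that into the cache" idea concrete, here is a minimal sketch of loop tiling applied to a matrix transpose. The function name `blocked_transpose` and the default tile size of 32 are my own assumptions for illustration (a 32x32 tile of doubles is 8 KiB, which is small next to a typical L1 data cache); the right tile size depends on the actual hardware.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Transposes a row-major rows x cols matrix, walking it tile by tile so that
// the source tile and the destination tile both stay cache-resident while
// they are being touched. The tile size is an assumed tuning knob.
std::vector<double> blocked_transpose(const std::vector<double>& in,
                                      std::size_t rows, std::size_t cols,
                                      std::size_t tile = 32) {
    assert(in.size() == rows * cols && tile > 0);
    std::vector<double> out(rows * cols);
    for (std::size_t ii = 0; ii < rows; ii += tile) {
        for (std::size_t jj = 0; jj < cols; jj += tile) {
            // Clamp the tile at the matrix edges.
            const std::size_t i_end = std::min(rows, ii + tile);
            const std::size_t j_end = std::min(cols, jj + tile);
            for (std::size_t i = ii; i < i_end; ++i) {
                for (std::size_t j = jj; j < j_end; ++j) {
                    // Element (i, j) of the input becomes (j, i) of the output.
                    out[j * rows + i] = in[i * cols + j];
                }
            }
        }
    }
    return out;
}
```

The result is the same as a plain double loop; only the traversal order changes, which is the kind of rewrite an experienced programmer can reason about by hand.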

"More modern" programming languages really solve no urgent problems in the HPC space, and current code works quite well there.

Reported from another HPC datacenter somewhere in the universe.


I suppose that most HPC problems are embarrassingly parallel™, and have very little if any mutable shared state?
I'd say that the opposite is more often the reality, which is why HPC systems tend to have high-bandwidth, low-latency networks.
High bandwidth may mean the need to consult some very large but immutable data structure. As a trivial example, multiplying two matrices requires accessing each matrix fully multiple times over, but neither of them is altered in the process, so it can safely be done in parallel. Recording the result of a (naive) matrix multiplication can also be done without programmatic coordination, because each element is only updated once, independently from others.
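As a rough sketch of the matrix example above: both inputs are shared read-only, and each thread writes only its own rows of the result, so no locks or atomics are needed. The function name `parallel_matmul` and the row-wise partitioning are my own choices for illustration, and plain `std::thread` stands in for whatever a real HPC code would use (OpenMP, MPI, and so on).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Naive C = A * B for row-major n x n matrices.
// A and B are only read; every element of C is written exactly once,
// by exactly one thread, so the threads never need to coordinate.
std::vector<double> parallel_matmul(const std::vector<double>& a,
                                    const std::vector<double>& b,
                                    std::size_t n, unsigned num_threads) {
    assert(a.size() == n * n && b.size() == n * n);
    std::vector<double> c(n * n, 0.0);
    if (num_threads == 0) num_threads = 1;
    // Ceiling division: each thread gets a contiguous band of rows.
    const std::size_t rows_per_thread = (n + num_threads - 1) / num_threads;

    std::vector<std::thread> workers;
    for (unsigned t = 0; t < num_threads; ++t) {
        const std::size_t row_begin = std::min(n, t * rows_per_thread);
        const std::size_t row_end = std::min(n, row_begin + rows_per_thread);
        if (row_begin == row_end) break;  // no rows left to hand out
        workers.emplace_back([&a, &b, &c, n, row_begin, row_end] {
            for (std::size_t i = row_begin; i < row_end; ++i) {
                for (std::size_t j = 0; j < n; ++j) {
                    double sum = 0.0;
                    for (std::size_t k = 0; k < n; ++k) {
                        sum += a[i * n + k] * b[k * n + j];
                    }
                    c[i * n + j] = sum;  // row i belongs to this thread only
                }
            }
        });
    }
    for (auto& w : workers) w.join();
    return c;
}
```

On some toolchains this needs to be linked with `-pthread`.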

This is very unlike, say, a database engine, where mutations occur all the time and may come from multiple threads.

Rust specifically makes it hard to impossible to clobber shared mutable state, e.g. to produce a dangling pointer. But this is not a problem that our matrix-multiplication example would have, so it won't benefit from being implemented in Rust. Maybe this applies to more classes of HPC problems.

HPC infrastructure is not like what you're used to. It has very high bandwidth, but latency depends on where your data lives. There are many more layers that complicate things, and each layer has a very different I/O speed.

https://extremecomputingtraining.anl.gov/sites/atpesc/files/...

Also, how the data is handled can be very different. Just look at how libraries like this work: they take advantage of those burst buffers and try to minimize what's being pulled from storage. There's also a lot of memory management in the code people write to do all this complex stuff, so that you aren't waiting around for disks... or worse... tape.

https://adios-io.org/applications/

On the contrary. However, they tend to manually manage memory rather than outsourcing it to a language runtime or a distributed key-value store.
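One common shape that manual memory management takes is a bump arena: grab one large buffer up front, carve working arrays out of it, and reset it between iterations instead of going back to the allocator. The sketch below is only an illustration of that idea; the `Arena` class and its methods are hypothetical names of mine, not taken from any particular HPC code, and it is meant for trivially constructible element types such as `double`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>

// A minimal bump allocator: one upfront allocation, pointer-bump handouts,
// and a reset() that makes the whole buffer reusable at once.
class Arena {
public:
    explicit Arena(std::size_t bytes)
        : buffer_(new unsigned char[bytes]), capacity_(bytes), used_(0) {
        assert(bytes > 0);
    }

    // Returns suitably aligned storage for `count` objects of type T,
    // or nullptr if the arena does not have enough room left.
    template <typename T>
    T* allocate(std::size_t count) {
        void* p = buffer_.get() + used_;
        std::size_t space = capacity_ - used_;
        // std::align bumps p up to the next aligned address and shrinks
        // `space` by the padding it skipped; it returns nullptr on failure.
        if (!std::align(alignof(T), count * sizeof(T), p, space)) {
            return nullptr;
        }
        used_ = (capacity_ - space) + count * sizeof(T);
        return static_cast<T*>(p);
    }

    // Forget every allocation; the memory itself is kept for reuse.
    void reset() { used_ = 0; }

    std::size_t used() const { return used_; }

private:
    std::unique_ptr<unsigned char[]> buffer_;
    std::size_t capacity_;
    std::size_t used_;
};
```

A solver loop could call `reset()` at the top of each time step and then `allocate<double>(n)` for its scratch arrays, so the steady state performs no heap allocations at all.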