Hacker News
by jcranmer 973 days ago
Proving the legality of transformations in the compiler is frequently impossible. Consequently, the main implementation approach has been to have the user assert it: this loop is parallel, please make it run in parallel. OpenMP and Rust's rayon crate work this way, for example. The other, similar innovation has been programming SIMD as if each lane were an independent thread, which is essentially the model of ispc or CUDA (or #pragma omp simd, natch).
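To make the "user asserts the loop is parallel" model concrete, here is a minimal sketch in C with OpenMP. The function name `saxpy` and its signature are my own choice for illustration, not something from any particular codebase. The pragma and the `restrict` qualifiers are promises made by the programmer; the compiler takes them on trust instead of proving independence itself.

```c
#include <assert.h>
#include <stddef.h>

/* Computes y[i] = a * x[i] + y[i] for i in [0, n).
   `restrict` promises that x and y do not overlap, and the pragma
   promises that iterations are independent, so the compiler does not
   have to prove either fact. When built without OpenMP support the
   pragma is typically ignored and the loop just runs serially. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    assert(x != NULL && y != NULL);
    #pragma omp parallel for simd
    for (size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

If the promise is wrong (say, x and y actually alias), the result is undefined behaviour or a data race rather than a compile error, which is the price of skipping the legality proof.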

The other big problem is that most code isn't written to take advantage of theoretical autoparallelization--you really want data in struct-of-arrays format, but most code tends to be written in array-of-structs format. This means that the vectorization cost model (even once legality is established, whether by user assertion or a sufficiently smart compiler) sees that it needs to do a lot of gathers and scatters and gives up on a viable path to vectorization really quickly.
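A small sketch of the two layouts, in C. The type and function names (`particle_aos`, `particles_soa`, `shift_x_aos`, `shift_x_soa`) are hypothetical, picked only to show the access patterns the cost model is looking at.

```c
#include <assert.h>
#include <stddef.h>

/* Array-of-structs: the x fields of neighbouring elements sit
   sizeof(struct particle_aos) bytes apart in memory. */
struct particle_aos { float x, y, z, mass; };

/* Struct-of-arrays: every field lives in its own contiguous array. */
struct particles_soa { float *x, *y, *z, *mass; };

/* Strided access: filling a vector register with x values means
   gathering them from between the y, z and mass fields, and writing
   them back means scattering. */
void shift_x_aos(struct particle_aos *p, size_t n, float dx)
{
    assert(p != NULL);
    for (size_t i = 0; i < n; ++i)
        p[i].x += dx;
}

/* Unit-stride access: consecutive x values are adjacent, so plain
   contiguous vector loads and stores are enough. */
void shift_x_soa(struct particles_soa *p, size_t n, float dx)
{
    assert(p != NULL && p->x != NULL);
    float *restrict x = p->x;
    for (size_t i = 0; i < n; ++i)
        x[i] += dx;
}
```

Both loops are equally legal to vectorize; the difference is only in how expensive the memory access looks, which is where the cost model tends to bail out on the array-of-structs version.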

1 comment

Maybe some history will help here too. In the 90s, the data model of most programming languages wasn't even array-of-structs, but array-of-pointers to other pointers to other pointers...

And the majority of software we've inherited is written this way.

In the 90s this didn't matter, since dereferencing a pointer cost about as much as an arithmetic operation. But on modern CPUs, with massive caches and more native parallelisation, the difference is dramatic.
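For illustration, here is the same reduction written against both kinds of layout, as a rough sketch in C. The names `node`, `sum_list` and `sum_array` are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Pointer-linked layout: each element is reached through the
   previous one. */
struct node { long value; struct node *next; };

/* The address of the next element is only known once the current
   node has been loaded, so the loads form a serial dependency chain. */
long sum_list(const struct node *head)
{
    long total = 0;
    for (const struct node *n = head; n != NULL; n = n->next)
        total += n->value;
    return total;
}

/* Every address is computable up front from the base pointer and
   the index, so the hardware can prefetch and the compiler can
   vectorize the reduction. */
long sum_array(const long *values, size_t n)
{
    assert(values != NULL || n == 0);
    long total = 0;
    for (size_t i = 0; i < n; ++i)
        total += values[i];
    return total;
}
```

The two functions do the same arithmetic; what differs is whether the CPU can see the next address before the current load has finished.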

So even now, the majority of languages we're using, and almost all the code we've inherited, are about as far as you can get from using modern CPUs efficiently.

The first task is to change all these languages to enable ergonomic programming without tons of indirection -- we're very far away from even providing basic tools for performant code.