How to Think about Parallel Programming: Not! [video] (2021)

	How to Think about Parallel Programming: Not! [video] (2021) (infoq.com)
	21 points by caned 359 days ago

2 comments

fifilura 356 days ago

Disclaimer: at work so I didn't watch the video.

For loops are the "gotos" of the parallel programming era.

Ditch them and the rest can be handled by the programming language abstraction.

Why? Because they 1. Enforce order of execution and 2. Allow breaking computation after a certain number of iterations.

bee_rider 356 days ago

I’ve always been surprised that we don’t have a really widely supported construct in programming that is like a for loop, but with no dependency allowed between iterations. It would be convenient for stuff like multi-core parallelism… and also for stuff like out of order execution!

Not sure how “break” would be interpreted in this context. Maybe it should make the program crash, or it could be equivalent to “continue” (in the programming model, all of the iterations would be happening in parallel anyway).

I vaguely feel like “for” would actually have been the best English word for this construct, if we stripped out the existing programming context. I mean, if somebody gives you instructions like:

For each postcard, sign your name and put it in an envelope

You don’t expect there to be any non-trivial dependencies between iterations, right? Although, we don’t often give each other complex programs in English, so maybe the opportunity for non-trivial dependencies just doesn’t really arise anyway…

In math, usually when you encounter “for,” it is being applied to a whole set of things without any loop dependency implied (for all x in X, x has some property). But maybe that’s just an artifact of there being less of a procedural bias in math…
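
A minimal Python sketch of the construct described above, assuming a hypothetical helper named `parallel_for` (it is not a standard library function). It leans on `concurrent.futures.ThreadPoolExecutor`, which runs the loop bodies in whatever order the pool picks and still returns the results in input order. In CPython, threads mostly help I/O-bound bodies, so treat this as an illustration of the programming model rather than a speedup recipe.

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_for(items, body, workers=4):
    """Hypothetical helper: apply `body` to every item with no ordering
    guarantee between iterations; results come back in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(body, items))

def prepare(postcard):
    # Each iteration touches only its own item, so execution order cannot matter.
    return f"{postcard}: signed, in envelope"

done = parallel_for(["card-1", "card-2", "card-3"], prepare)
```

There is deliberately no way to `break` out of `parallel_for`; the body can only return early for its own item.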

vlovich123 356 days ago

We actually do have the abstractions, but the problem is that the vast majority of for loops don’t benefit: you need enough work per loop to amortize the overhead of coordinating the threads. Additionally, you’ve got all sorts of secondary effects, like cache write contention, that will fight any win you try to extract out of for-loop parallelism. What we’ve been learning for a long time as an industry is that you benefit most from task-level parallelism with minimal to no synchronization.
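
One common way to get closer to task-level parallelism is to hand each worker a whole chunk of iterations instead of a single one, so the coordination cost is paid once per chunk. The sketch below assumes hypothetical names (`chunked`, `sum_squares`, `parallel_sum_squares`) and an arbitrary chunk size. Each task writes only to its own local partial sum and the only synchronization is the final combine. In CPython the GIL limits the actual speedup for pure-Python arithmetic, so the point here is the shape of the decomposition.

```python
from concurrent.futures import ThreadPoolExecutor

def chunked(seq, size):
    # Yield consecutive slices of at most `size` elements.
    for start in range(0, len(seq), size):
        yield seq[start:start + size]

def sum_squares(chunk):
    # One coarse task: no shared state is written while it runs.
    return sum(x * x for x in chunk)

def parallel_sum_squares(values, chunk_size=10_000, workers=4):
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(sum_squares, chunked(values, chunk_size))
        return sum(partials)  # single combine step at the end
```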

dkarl 356 days ago

Granted this probably isn't the parallel application that the other poster was envisioning, but it can be extremely useful when a computation depends on a large number of I/O-bound tasks that may fail, like when you are servicing a request with a high fan-out to other services, and you need to respond in a fixed time with the best information you have.

For example, if you need to respond to a request in 100ms and it depends on 100 service calls, you can make 100 calls with an 80ms timeout; get 90 quick responses, including two transient errors, and immediately retry the errors; get eight more successful responses and two timeouts; and then send the response within the SLA using the 98 responses you received.
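
A scaled-down sketch of that fan-out pattern using Python's `asyncio`. Everything about the services is made up for illustration: `call_service` is a hypothetical stand-in for a real RPC, service 3 fails once with a transient error, service 7 never answers in time, and the budget is 0.2 seconds for 10 calls. It runs on a single-threaded event loop, so the calls overlap without any thread-level parallelism.

```python
import asyncio

class TransientError(Exception):
    pass

async def call_service(i, attempt):
    # Hypothetical stand-in for a real service call.
    if i == 7:
        await asyncio.sleep(10)      # simulates a call that blows the budget
    await asyncio.sleep(0.01)        # simulates a quick response
    if i == 3 and attempt == 0:
        raise TransientError(i)      # fails once, succeeds on retry
    return f"result-{i}"

async def call_with_retry(i, retries=1):
    for attempt in range(retries + 1):
        try:
            return i, await call_service(i, attempt)
        except TransientError:
            if attempt == retries:
                raise

async def fan_out(n, budget_s):
    tasks = [asyncio.create_task(call_with_retry(i)) for i in range(n)]
    done, pending = await asyncio.wait(tasks, timeout=budget_s)
    for task in pending:
        task.cancel()                # give up on anything still running
    await asyncio.gather(*pending, return_exceptions=True)
    results = {}
    for task in done:
        if not task.cancelled() and task.exception() is None:
            i, value = task.result()
            results[i] = value
    return results                   # best information available in time

def run_demo():
    return asyncio.run(fan_out(10, 0.2))
```

Calling `run_demo()` returns a dictionary with nine of the ten results; the caller decides whether a partial answer is good enough.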

vlovich123 356 days ago

That doesn't require parallelism, just concurrency. But yes, you'd use a similar task-local map/reduce construct to express doing a bunch of concurrent I/O in parallel (spawning each I/O on a separate thread would be counter-productive & a hack to avoid adding an event loop / async I/O).

fifilura 355 days ago

Let the engine or compiler decide if it is small enough to run on one core.

bee_rider 355 days ago

I’m only familiar with Fortran, and my OpenMP is a little rusty. But I think there are different pragmas for vectorization and threading, so you have to tell it to do one or the other (is that wrong?), instead of expressing “well, we don’t have dependencies here, so do what you will.”

vlovich123 355 days ago

That would be great if the engine or compiler had that kind of capability but building that requires solving the halting problem.

Even if you try to do it with heuristics, go ask Itanium how that worked out for them and they tried a much simpler problem than what you’re proposing.

fifilura 355 days ago

I am not familiar with the Itanium story and I don't know who to ask.

But it seems to me like this would be a safe space to experiment. With heuristics and pragmas as a fallback. Because with the right approach solutions would mostly be better than not doing anything.

And you could do it at runtime when you know the size of the input.

And what about applying the logic to places where you can see that the loop will end?

I believe query planners in for example Trino/BigQuery do this already?
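
A rough Python sketch of the runtime version of this idea: look at the input size once it is known and only then choose a strategy. The name `adaptive_map` and the cutoff `PARALLEL_THRESHOLD` are hypothetical, and a real cutoff would have to come from measurement. This is not how Trino or BigQuery plan queries, just the simplest possible size-based heuristic.

```python
from concurrent.futures import ThreadPoolExecutor

PARALLEL_THRESHOLD = 1_000  # hypothetical cutoff; would need to be measured

def adaptive_map(body, items, workers=4):
    """Return (results, strategy), choosing the strategy from the input size."""
    items = list(items)
    if len(items) < PARALLEL_THRESHOLD:
        # Small input: coordination overhead would likely dominate.
        return [body(x) for x in items], "sequential"
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(body, items)), "parallel"
```

Because the loop body is assumed to have no dependency between iterations, both branches produce the same results, which is what makes the choice safe to automate.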

mystified5016 356 days ago

> like a for loop, but with no dependency allowed between iterations

"Break" is a dependency between iterations, and really only makes sense in a sequential iteration. In a parallel for loop, you can break from the current iteration, but the next is probably already running.

If you want any iteration to be able to cancel all others, they have to be linked somehow. Giving every task a shared cancellation token might be simplest. Or you turn your for loop into a sort of task pool that intelligently herds threads in the background and can consume and relay cancellation requests.
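
A minimal sketch of the shared cancellation token idea in Python, using a `threading.Event` as the token. The function name `search` and its parameters are hypothetical. An iteration that wants to "break" sets the token; iterations that start afterwards see it and return immediately, while iterations already in flight simply finish.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

def search(items, predicate, workers=4):
    cancel = threading.Event()       # token shared by every iteration
    found = []
    lock = threading.Lock()

    def body(item):
        if cancel.is_set():
            return                   # another iteration already "broke"
        if predicate(item):
            with lock:
                found.append(item)
            cancel.set()             # ask the remaining iterations to stop

    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(body, items))  # drain the iterator so every task runs
    return found, cancel.is_set()
```

With more than one matching item, `found` may hold several entries, since iterations that were already running when the token was set are not interrupted.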

But I agree, we need a new paradigm for parallel programming. For loops just don't cut it, despite being one of the most natural-feeling programming concepts.

C#'s Parallel.For and ForEach are a step in the right direction, but very unergonomic and unintuitive. I think we could get by with just bolting parallelism onto for loops, but we need a fundamentally parallel concept. I assume it'd look something like cuda programming but I really don't know.

fifilura 356 days ago

I believe this lecture is universal for both SQL and CUDA.

https://gfxcourses.stanford.edu/cs149/fall24/lecture/datapar...

mannykannot 356 days ago

The tricky cases are the very many where there are dependencies between iterations, but not demanding the strict serialization that a simple loop enforces. We have constructs for that, but there's an irreducible complexity to using them correctly.

two_handfuls 356 days ago

They're not in the language proper, but "parallel for" is a common construct. I've seen it in C# and Rust, but I'm sure other languages have it too.

It may be a good idea to use a framework with explicitly stateless "tasks" and an orchestrator (parallel, distributed, or both). This is what Spark, TensorFlow, Beam and others do. Those will have a "parallel for" as well, but now, in addition to threads, you can use remote computers with a configuration change.

bee_rider 356 days ago

The big C and Fortran compilers have OpenMP support, which includes parallel for loops. They just feel kind of… bolted on, being a pragma-based language extension. And what I really want to express to the thing isn’t “fork here” but “here are some independent operations, tell the optimizing compiler about it,” and then the optimizing compiler can (among other transformations) also decide to sprinkle some threads in there.

epgui 356 days ago

> we don’t have a really widely supported construct in programming that is like a for loop, but with no dependency allowed between iterations

Uhhh... we don't? It seems to me like we do. This is a solved problem. Depending on what you're trying to do, there's map, reduce, comprehensions, etc.
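
For concreteness, here is what those constructs look like in Python (the variable names are just for illustration). A `map` or a comprehension describes each output element purely in terms of its own input element, so nothing in the expression depends on the order of evaluation. A `reduce` with an associative operation such as addition can likewise be regrouped freely, even though Python's own `functools.reduce` evaluates left to right.

```python
from functools import reduce
from operator import add

xs = [1, 2, 3, 4]

# map: each output depends only on its own input element.
squares_map = list(map(lambda x: x * x, xs))

# comprehension: the same computation, written declaratively.
squares_comp = [x * x for x in xs]

# reduce: addition is associative, so the grouping of partial sums is free.
total = reduce(add, squares_comp, 0)
```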

dkarl 356 days ago

And for those who also don't want to be forced to sequence the computations, i.e., wanting to run them concurrently and potentially in parallel, each approach to concurrency supports its own version of this.

For example, choosing Scala on the JVM because that's what I know best, the language provides a rich set of maps, folds, etc., and the major libraries for different approaches to concurrency (futures, actors, effect systems) all provide ways to transform a collection of computations into a collection of concurrent operations.

Curious if the poster who said "we don't have a really widely supported construct" works in a language that lacks a rich concurrency ecosystem or if they want support baked into their language.

epgui 356 days ago

That's right. Personally, I like to think in functional programming terms, and with FP, concurrency/parallelism is more or less a non-issue.

Weryj 356 days ago

Sounds like you're talking about real-time operating systems. I don't know if there are many/any programming languages that build those operational requirements into the syntax/abstraction.

Jtsummers 356 days ago

(2011) and was submitted back then: https://news.ycombinator.com/item?id=2105661
