by MR4D 971 days ago
I would upvote this 100 times if I could.

I've thought this way ever since macOS added Grand Central Dispatch [1]. Of course, I thought the industry would follow quickly and that tooling would coalesce around this concept. Seems the industry wants to take its sweet time.

[1] - https://en.wikipedia.org/wiki/Grand_Central_Dispatch

3 comments

I mean, OpenMP dates back to 1997 (1998 for C and C++). Apple, however, has never supported it for what can only be selfish reasons (particularly since Clang has a quite good implementation provided by Intel, which can easily be installed on a Mac if you want). GCD came a decade later.

For basic parallelism, nothing beats OpenMP for ease of adapting existing code (often a single "#pragma omp parallel for" directive is enough). Even for more complex parallelism, particularly where per-thread resources need to be managed, OpenMP still provides a much simpler programming model than the alternatives.
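As a rough illustration of the single-directive case described above, here is a minimal C++ sketch. The function names `scale` and `dot` are made up for this example and are not from the thread. It assumes a compiler with OpenMP enabled (e.g. `-fopenmp` on GCC or Clang); without that flag the pragmas are ignored and the loops simply run serially with the same result.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Scale every element in place. Each iteration touches only element i,
// so the iterations are independent and one directive is enough.
void scale(std::vector<double>& data, double factor) {
    const long n = static_cast<long>(data.size());
    #pragma omp parallel for
    for (long i = 0; i < n; ++i) {
        data[static_cast<std::size_t>(i)] *= factor;
    }
}

// Dot product. Here every iteration adds into `total`, so the
// reduction clause gives each thread a private partial sum and
// combines the partial sums when the loop ends.
double dot(const std::vector<double>& a, const std::vector<double>& b) {
    assert(a.size() == b.size());
    const long n = static_cast<long>(a.size());
    double total = 0.0;
    #pragma omp parallel for reduction(+ : total)
    for (long i = 0; i < n; ++i) {
        const std::size_t k = static_cast<std::size_t>(i);
        total += a[k] * b[k];
    }
    return total;
}
```

The second loop is the case where the pragma alone would not be enough: dropping the `reduction` clause would leave all threads writing to the same `total`.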

OpenMP and GCD solve different problems. In most cases you wouldn't want to use GCD to parallelize the same tasks you're parallelizing with OpenMP. GCD is better suited to one-off work (toss this task into the queue, toss that task into the queue; or "as we get new items from the user, toss the processing into the queue", where we don't know the rate of new items coming in, so batching doesn't make as much sense), whereas OpenMP targets things like scientific computing and simulations, where you know you have a million objects you want to perform a computation on. A GCD version of the same would be slower by a large measure if you spawned a task per work item, or else you'd recreate parts of OpenMP to divide the work across a smaller number of tasks. And you wouldn't want to use OpenMP to parallelize the kind of things you toss into a work-queue model like the one GCD offers.

Sure, OpenMP and GCD provide different interfaces around the same concept of a managed thread pool. Given both, one would use them for different tasks (in the same way one actually uses OpenMP and std::async for complementary purposes). But in the context of GP's basic parallelized for/map/reduce operations, either can be used fine (although OpenMP would probably be more pleasant to write).

I'm not familiar with GCD, but after reading the wiki page I'd say most languages have something like that: a queue you can add items to, which are then processed by multiple threads of execution.
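To make the "toss each item at the runtime as it arrives" style concrete, here is a small C++ sketch that uses `std::async` as a stand-in for a dispatch queue. This is only an analogy: unlike GCD, the standard does not promise that `std::launch::async` tasks run on a managed thread pool. `process` and `process_all` are hypothetical names for this example.

```cpp
#include <cstddef>
#include <future>
#include <string>
#include <vector>

// Hypothetical per-item work: stands in for whatever processing a
// newly arrived item needs.
std::size_t process(const std::string& item) {
    return item.size();
}

// Submit each item as its own task, then collect the results in
// submission order. std::launch::async requests execution on another
// thread; whether those threads are pooled is up to the implementation.
std::vector<std::size_t> process_all(const std::vector<std::string>& items) {
    std::vector<std::future<std::size_t>> pending;
    pending.reserve(items.size());
    for (const std::string& item : items) {
        pending.push_back(std::async(std::launch::async, process, item));
    }

    std::vector<std::size_t> results;
    results.reserve(pending.size());
    for (std::future<std::size_t>& f : pending) {
        results.push_back(f.get());  // blocks until that task finishes
    }
    return results;
}
```

One task per item is fine for a handful of user-driven events. For a million small work items, the per-task overhead is exactly the cost the comments above warn about, and a chunked parallel loop is the better fit.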

I'd also say that most languages have something similar to OpenMP: parallel for loops, etc. Great if you have some read-only data in arrays and wish to process it.

However, in my opinion, it doesn't really matter how convenient a parallel/async programming model is to use, as the real work is ensuring that no shared mutable state is being updated in parallel. The other issue is that, once you have formulated or reformulated a particular problem to fit this model, ensuring that it stays that way is pretty challenging on larger teams. Someone can easily and unknowingly commit something that breaks such assumptions.
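A small C++ sketch of how thin that line can be, assuming plain `std::thread` workers and a shared counter (`count_in_parallel` is a made-up name for this example). The version below is safe only because the counter is a `std::atomic`. Changing that one type to a plain `long` would still compile and would often pass a quick test, yet it would be a data race.

```cpp
#include <atomic>
#include <cstddef>
#include <thread>
#include <vector>

// Increment one shared counter from several threads.
// std::atomic makes each increment indivisible. With a plain `long`
// here, concurrent increments could be lost, and the behaviour would
// formally be undefined.
long count_in_parallel(std::size_t num_threads, long increments_per_thread) {
    std::atomic<long> counter{0};

    std::vector<std::thread> workers;
    workers.reserve(num_threads);
    for (std::size_t t = 0; t < num_threads; ++t) {
        workers.emplace_back([&counter, increments_per_thread] {
            for (long i = 0; i < increments_per_thread; ++i) {
                counter.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }

    // Wait for every worker before reading the final value.
    for (std::thread& w : workers) {
        w.join();
    }
    return counter.load();
}
```

Nothing in the function signature tells a later contributor that the counter must stay atomic, which is the maintenance problem the comment describes.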

At the end of the day, no matter what kind of code you are writing, you either have tools and processes in place that reduce the risk/mitigate the impacts of bugs, or there is always the risk of serious problems being introduced. An unknowing change that breaks parallelism in another component could just as well be an unknowing change that breaks authentication or defeats a security boundary.

Parallelism introduces an additional class of bugs, but they are fundamentally addressed the same way as any other class of bugs - e.g. testing, tools, and code review. If someone can unknowingly break a system, that means the tools and processes weren't good enough.

One difference from most other classes of bugs is that threading issues can be quite nondeterministic, which makes it harder to automatically disambiguate between flaky tests and real bugs being caught.

Also, the code introducing a race condition may get lucky when your CI system runs the tests and still make it into your main branch.

I agree that tooling (like static analysis, Rust's borrow-checker, etc) can play a big role here though.

That is the issue. It is very hard to write tests that ensure parallel code is correct, as it can easily work 99.9% of the time. This is not the case with typical functional requirements.

I've played around with actors in Swift for shared mutable state, which enforce async access patterns.

https://www.swiftbysundell.com/articles/swift-actors/

It’s been a while, but isn’t GCD an OS-global queue rather than local to the process?

Each process has its own GCD queue hierarchy, which is executed by an in-process thread pool. It does have some bits coupled with the kernel, though, for stuff like mapping queue/task QoS classes to Darwin thread QoS classes and, relatedly, priority inversion.

Back then it wasn't even certain that its adoption would take off.

My HPC programming lectures were done on PVM, and I bet only grey beards know what it stands for.

> Seems the industry wants to take its sweet time.

We're inching towards an in-vogue way to do what Erlang had figured out in the '80s. We'll pick up the pace any day now. Surely.

Grand Central Dispatch is famous for breaking tons of old programs for the grave sin of trying to `fork` a child process.