by cpach 1104 days ago
Correct me if I’m wrong, but isn’t concurrency enough for “most” use-cases? When does one really need true parallelism?
3 comments

Concurrency is typically a great solution for IO-bound tasks, which is why it figures so prominently in modern-day webdev. It's also essential for any UI, where you don't want a single-threaded process to block user interaction just because it's running a task that takes time to complete.
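To make the IO-bound case concrete, here is a minimal sketch in Python. The names `fake_io`, `fetch_all` and `run_demo` are made up for illustration, and `asyncio.sleep` stands in for a real network or disk wait. Everything runs on one thread, yet the three waits overlap:

```python
import asyncio
import time


async def fake_io(name: str, delay: float) -> str:
    # asyncio.sleep is a stand-in for waiting on a socket or a file.
    await asyncio.sleep(delay)
    return name


async def fetch_all() -> list:
    # gather() starts all three waits and resumes once every one has finished.
    return await asyncio.gather(
        fake_io("a", 0.2),
        fake_io("b", 0.2),
        fake_io("c", 0.2),
    )


def run_demo() -> tuple:
    start = time.perf_counter()
    results = asyncio.run(fetch_all())
    return results, time.perf_counter() - start
```

Calling `run_demo()` should take roughly as long as one wait rather than the sum of all three, with no second core involved.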

Parallelism is used to solve CPU-bound problems. Obvious examples are things like efficient matrix multiplication. However, getting a real speedup is what makes parallel programming so hard. There's a lot of nuance to working effectively with multiple cores/threads, and it means even "embarrassingly parallel" problems sometimes fail to see benefits from naive parallel programming (for example, if your parallel solution requires something as little as more frequent visits to L2 cache instead of L1, you can see performance degradation).

So parallelism and concurrency ultimately solve two very different domains of problems.

I don't think IO-bound vs. CPU-bound is the correct way of differentiating between the two. Parallel programming deals with real-time problems that have data dependencies and synchronization requirements, while concurrent programming deals with independent computations, where the progress or completion of one does not depend on the progress of another.

I agree, however, that they do ultimately solve two very different domains of problems, and there is a lot of confusion going on here.

On a side note: matrix multiplication is generally not a CPU-bound problem, but a memory-bound one.

Parallelism is only about performance, that's it. If you need something to go faster, parallelism is an option.

Looking into the future, parallelism is one of the only remaining techniques for scaling classical computing. Processors have stopped getting faster. Instead, they're getting fatter (more cores).

Concurrency deals with what can be parallel, i.e., computations that don't share data dependencies. For example, a map() function is trivially a statement of concurrency: each item needs to have a function applied to it, and assuming pure functions, that can all be done concurrently.
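A quick sketch of the map() point in Python, using a hypothetical pure function `square`. Because no call depends on another, the work can be handed to an executor without changing the result. One caveat: this uses a thread pool, which in standard CPython builds with the GIL will not actually run pure-Python CPU work on several cores at once; swapping in `ProcessPoolExecutor` is the usual way to get that.

```python
from concurrent.futures import ThreadPoolExecutor


def square(x: int) -> int:
    # Pure function: no shared state, so calls can run in any order.
    return x * x


items = [1, 2, 3, 4]

# Plain map(): only says "apply square to each item".
sequential = list(map(square, items))

# Same statement, but the executor decides how the calls get scheduled.
with ThreadPoolExecutor(max_workers=4) as pool:
    scheduled = list(pool.map(square, items))
```

`pool.map` returns results in input order, so `sequential` and `scheduled` hold the same list.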

Parallel computing cares about what should be parallel, i.e., actually implementing parallelism. For most programmers, this job can (and probably should) be left to the scheduler, whose job is to translate concurrency into reasonably sane parallelism.

The scheduler comes with overhead, though, that can be avoided with hand-spun parallelism. I like to think about it like manual memory management: for most programs and programmers, using a garbage collector that a memory management expert wrote is easier/better than manually allocating and freeing memory, but there are performance gains to be had if you don't.
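For a rough idea of what "hand-spun" looks like, here is a sketch in Python where the partitioning is decided up front instead of being left to a pool. `chunked_sum` is a made-up name, and the example only shows the structure: in standard CPython the GIL keeps these threads from speeding up pure-Python arithmetic, so treat it as an illustration of the shape, not a benchmark.

```python
import threading


def chunked_sum(data: list, workers: int = 4) -> int:
    # Hand-rolled partitioning: one contiguous slice per worker.
    size = (len(data) + workers - 1) // workers
    # One result slot per worker, so no lock is needed.
    partial = [0] * workers

    def work(i: int) -> None:
        partial[i] = sum(data[i * size:(i + 1) * size])

    threads = [threading.Thread(target=work, args=(i,)) for i in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Combine the per-worker results once everything has finished.
    return sum(partial)
```

The trade-off is the one described above: you pick the chunk size and worker count yourself, which avoids per-task scheduling overhead but also means getting it wrong is on you.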