by samps 5829 days ago
While you're right that the main problem has to do with sharing resources between different threads of execution, the difficult part is not actually doing that sharing. Sharing data is, by itself, very simple, and can be accomplished via many different helpful abstractions (try looking at Wikipedia's description of "shared memory" or "message passing"). In the case of shared memory, sharing can be accomplished just by writing data to memory in one thread and reading it in another. Easy!
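
To make the "Easy!" concrete, here is a minimal sketch of shared-memory sharing in Python. The names (`shared`, `writer`, `reader`) are made up for illustration. Note that even this toy needs a `join()` so the reader runs after the writer, which is already a small piece of coordination.

```python
import threading

shared = {}  # ordinary memory, visible to every thread in the process


def writer():
    # Sharing is just a write...
    shared["message"] = "hello from the writer thread"


def reader(out):
    # ...and a read in another thread.
    out.append(shared.get("message"))


t1 = threading.Thread(target=writer)
t1.start()
t1.join()  # wait for the write to finish before anyone reads

result = []
t2 = threading.Thread(target=reader, args=(result,))
t2.start()
t2.join()

print(result)
```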

The difficult part is in how the threads actually coordinate. The problem is extremely application-specific (what exactly do threads need to share? When do they need to share it? These cannot be answered in a general way). It's generally accepted that concurrency bugs (examples: data races (colloquially "race conditions"), deadlock, atomicity violations, locking discipline violations) are extremely difficult to find and fix. This is probably either because (1) programmers are not accustomed to thinking about coordinating between parallel activities or (2) people are just worse at thinking concurrently than thinking sequentially.
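
As a rough illustration of one of these bugs, the sketch below shows an atomicity violation (a lost update) and a lock-based fix. It is a contrived example, not a recipe: the `Barrier` exists only to force the bad interleaving every time, whereas in real code it would show up rarely and unpredictably. All names here are hypothetical.

```python
import threading


def run(increment, n_threads=2):
    """Run `increment` in n_threads threads against a fresh counter."""
    state = {"count": 0}
    threads = [
        threading.Thread(target=increment, args=(state,))
        for _ in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["count"]


barrier = threading.Barrier(2)


def racy_increment(state):
    seen = state["count"]      # read
    barrier.wait()             # force both threads to read before either writes
    state["count"] = seen + 1  # write based on a stale value


lock = threading.Lock()


def locked_increment(state):
    # The lock makes the read-modify-write one indivisible step.
    with lock:
        state["count"] += 1


racy_total = run(racy_increment)      # one increment is lost
locked_total = run(locked_increment)  # both increments survive

print(racy_total, locked_total)  # prints "1 2"
```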

So new libraries/methods for accomplishing communication between threads are always welcome and can help reduce the complexity of parallel programming. However, nobody has yet found an abstraction that both works for most kinds of parallel programs (MapReduce is very simple to work with but also very restrictive) and is simple enough for people to program in without fear of hard-to-solve concurrency bugs (message passing and shared memory are both quite general but considered somewhat unsafe).

So, the problem is not that a good abstraction layer would be too computationally expensive -- it's that no one even knows what the abstraction should be! Hope this makes the issue clearer.

1 comments

Thanks, that does make a lot clearer.

It seems to me you and I are very comfortably coordinating sharing resources right now. In parallel, a browser is running on my computer, a browser is running on your computer, and an HTTP server is running on the HN server.

But between us we're doing something collaborative. We are both contributing text out of which a single document is synthesized, and we might both have upvoted this story, etc.

Our collaboration here is structured in terms of HTTP requests/responses. Does this in itself address issues of "race conditions", "deadlocks", etc?

Can we imagine a future in which computation and memory are so abundant, we can virtualize this client/server paradigm for any collaborating parallel programs?

Or can we imagine a future in which there is no need to parallelize a large class of programs, because they will execute satisfyingly fast in a single thread?

Unfortunately, a client/server model still does not solve all of our concurrency problems. The problem is that the algorithm we're running--contributing a handful of text responses to the same repository--is very simple. But more complicated algorithms--say, if you're Google for instance, looking for a few words in billions of documents--need significantly more expertise to be correct (and perform well on top of that!).
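
Here is a small sketch of why requests and responses alone don't rule out races. `DocumentServer` is a hypothetical in-memory stand-in for a server holding one document; it is not how HN works. Each request is handled one at a time, yet a comment still gets lost because the read-then-write sequence spans two requests.

```python
class DocumentServer:
    """Hypothetical stand-in for a server that stores a single document."""

    def __init__(self):
        self.text = ""

    def get(self):
        # Like a GET request: return the current document.
        return self.text

    def put(self, new_text):
        # Like a PUT request: replace the whole document.
        self.text = new_text


server = DocumentServer()

# Both clients fetch the document before either one writes it back.
copy_a = server.get()
copy_b = server.get()

server.put(copy_a + "comment from A\n")
server.put(copy_b + "comment from B\n")  # overwrites A's update

final_text = server.get()
print(repr(final_text))  # prints 'comment from B\n'
```

A real site avoids this by having the server do the append itself, which works precisely because appending comments is such a simple operation.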

It's certainly feasible to imagine that single-threaded performance will improve to the point that parallelism will no longer be "necessary" (although many people have observed that, with Moore's Law no longer yielding the performance improvements it did just a few years ago, this may be too far out). However, applications arise and expand to fill whatever performance we have available. By achieving parallel performance gains, we'll enable things that weren't possible before, no matter how good single-threaded performance is. Also important is the distinction between parallelism and concurrency: many domains need multiple threads for reasons that have nothing to do with performance! A database server, for instance, needs to service many requests concurrently; it can't function correctly in a "single-threaded" world.