by dragontamer 2053 days ago

> My main point is it's actually the case that a copy ties mutation in speed quite often if you structure the flow of your data correctly. If you're operating on a value, writing it somewhere else is often close to free.

So, here's something. I think that a copying-based methodology can be fast, but more importantly, a copying-based methodology can be easier to parallelize.

If code that mutates state is too hard to parallelize, I think rewriting it to be copy-based, and then using that copy-based version as the basis for parallel code, is a good idea.

What's difficult is that "mutating state" is a local maximum, so to speak: it's probably the fastest that any single thread can get. From this perspective, finding parallelism at a "higher level" can be more useful than trying to parallelize a well-optimized single-threaded program.
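
To make the copy-based idea concrete, here is a minimal C++ sketch. The function name `scaled_copy` and the "multiply by a factor" workload are made up for illustration. The input is only read and each thread writes its own disjoint slice of a fresh output buffer, so no locks or atomics are needed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper: builds a new vector instead of mutating `input`.
// Every thread reads the shared, read-only input and writes a disjoint
// slice of `output`, so the threads never touch the same element.
std::vector<int> scaled_copy(const std::vector<int>& input, int factor,
                             unsigned num_threads) {
    assert(num_threads > 0);  // assumption: caller asks for at least one thread
    std::vector<int> output(input.size());
    // Ceiling division so every element lands in some chunk.
    const std::size_t chunk = (input.size() + num_threads - 1) / num_threads;

    std::vector<std::thread> workers;
    for (unsigned t = 0; t < num_threads; ++t) {
        const std::size_t begin = std::min(input.size(), t * chunk);
        const std::size_t end = std::min(input.size(), begin + chunk);
        if (begin == end) continue;  // nothing left for this thread
        workers.emplace_back([&input, &output, factor, begin, end] {
            for (std::size_t i = begin; i < end; ++i) {
                output[i] = input[i] * factor;
            }
        });
    }
    for (auto& w : workers) w.join();
    return output;
}
```

The in-place version of the same loop would likely be a bit faster on one thread, but splitting it safely means reasoning about who is allowed to touch which element and when. With the copy, that question answers itself.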

----------

Case in point: multithreaded producer-consumer queues are difficult to write and even to define. But multithreaded producer-producer queues (and consumer-consumer queues) are very easy: just an obvious "atomic_add(tail, 1)" (for producers) and "atomic_subtract(tail, 1)" (for consumers) kind of thing.

I call it a "producer-producer" queue, because you need assurances that no one is consuming from the queue for it to work. But if you synchronize your threads so that they all copy to a queue (and no one consumes from it), it's really fast, and actually very parallelized. Ditto for the reverse (the consume phase).
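
A rough C++ sketch of what I mean, under some assumptions: the names `PhasedQueue` and `produce_then_consume` are made up, the buffer has a fixed capacity that is never exceeded, and `std::thread::join` stands in for whatever barrier separates the two phases. Since both phases work on `tail`, items come out in the reverse of the order their slots were claimed.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical fixed-capacity buffer: filled in a produce-only phase,
// drained in a consume-only phase. The two phases must never overlap.
class PhasedQueue {
public:
    explicit PhasedQueue(std::size_t capacity) : slots_(capacity), tail_(0) {}

    // Produce phase: fetch_add hands each caller a unique slot index.
    void push(int value) {
        const long slot = tail_.fetch_add(1);
        // Assumption: callers never push more than `capacity` items.
        assert(slot >= 0 && static_cast<std::size_t>(slot) < slots_.size());
        slots_[static_cast<std::size_t>(slot)] = value;
    }

    // Consume phase: fetch_sub claims the current top slot.
    // Returns false once the buffer is empty.
    bool pop(int& out) {
        const long old_tail = tail_.fetch_sub(1);
        if (old_tail <= 0) {
            tail_.fetch_add(1);  // undo the over-decrement
            return false;
        }
        out = slots_[static_cast<std::size_t>(old_tail - 1)];
        return true;
    }

private:
    std::vector<int> slots_;
    std::atomic<long> tail_;
};

// Runs one produce-only phase followed by one consume-only phase and
// returns the sum of everything consumed.
long produce_then_consume(int num_threads, int items_per_thread) {
    PhasedQueue queue(static_cast<std::size_t>(num_threads) * items_per_thread);

    std::vector<std::thread> producers;
    for (int t = 0; t < num_threads; ++t) {
        producers.emplace_back([&queue, t, items_per_thread] {
            for (int i = 0; i < items_per_thread; ++i) {
                queue.push(t * items_per_thread + i);
            }
        });
    }
    for (auto& p : producers) p.join();  // barrier: no consumer starts earlier

    std::atomic<long> total{0};
    std::vector<std::thread> consumers;
    for (int t = 0; t < num_threads; ++t) {
        consumers.emplace_back([&queue, &total] {
            int value;
            while (queue.pop(value)) total.fetch_add(value);
        });
    }
    for (auto& c : consumers) c.join();
    return total.load();
}
```

The whole thing leans on the phase separation. If a `pop` could run while a `push` is still in flight, a consumer could claim a slot whose value has not been written yet, which is exactly the problem a real producer-consumer queue has to solve.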