| > DO NOT EVER MUTATE STATE! This is counterproductive. Mutating state is by far the most efficient operation on the computer. Copying state means (usually) calling the memory allocator, which (usually) must be done in a sequential manner. There are optimizations to allow memory-allocation to occur in parallel, but... things start to get inefficient (use of thread-local storage, which makes many pointer-indirections, etc. etc.). ------- Why do people program parallel code, despite its overwhelming complexity? Simple: to be faster. That's it. It's an optimization, arguably premature, but the programmer reaching for the "parallel" button is going for optimization for a reason. ------ Sequential code with state-mutations could very well be more efficient than parallel code with tons of unnecessary copies. Registers are fast, L1 cache is fast, etc. etc. Doing an operation, then undoing it (entirely inside of register space) is so incredibly fast, it will make your head spin. ------- 1. Write code. 2. If code isn't fast enough, single-thread optimize it. This usually means mutating state in an intelligent manner. 3. If code is STILL not fast enough, multithread it. You make a copy of the state, and then have multiple threads (each with their own copy of the state) doing... whatever you're trying to do. |
Really? Pretty much every instruction takes separate source and destination addresses or registers. Memory caches obviously complicate things, but it seems to me that processing data from one location to another is the most efficient operation.
> Copying state means (usually) calling the memory allocator
Sure, I guess that's somewhat true if your main conception of moving data around involves not really knowing where it needs to go and having to figure that out. But there are many ways to structure a computation that minimize or avoid allocator usage.
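To make that concrete, here is a rough sketch of one such structure: two buffers allocated once up front, with each step reading from one and writing to the other, then swapping roles. The names `smooth_into` and `run_steps` and the smoothing operation itself are made up for illustration. The point is only that the loop never calls the allocator and never mutates the data it is reading.

```rust
/// One smoothing step: reads `src` (never modified) and writes every element
/// of `dst`. Each output is the average of an element and its two neighbours,
/// with the edges clamped to the first/last element.
fn smooth_into(src: &[f32], dst: &mut [f32]) {
    assert_eq!(src.len(), dst.len());
    let n = src.len();
    for i in 0..n {
        let left = src[i.saturating_sub(1)];
        let right = src[(i + 1).min(n - 1)];
        dst[i] = (left + src[i] + right) / 3.0;
    }
}

/// Runs `steps` smoothing passes using two buffers allocated during setup.
/// Inside the loop there are no allocations and no in-place updates of the
/// buffer being read.
fn run_steps(initial: &[f32], steps: usize) -> Vec<f32> {
    let mut current = initial.to_vec(); // setup allocation 1
    let mut next = vec![0.0; initial.len()]; // setup allocation 2
    for _ in 0..steps {
        smooth_into(&current, &mut next);
        // Swaps the two Vec handles (pointer, length, capacity);
        // no elements are copied.
        std::mem::swap(&mut current, &mut next);
    }
    current
}
```

Calling `run_steps(&data, 100)` does two allocations total, however many steps it runs.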
> unnecessary copies
I think this is the issue. If you have to copy first and then do in-place mutations, of course it'll be slower. But you can instead structure your entire computation to process your data from one location to another. Then all your points about registers and L1 cache being fast still hold true, while you're still able to handle immutable data.
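A small sketch of the difference I mean, with hypothetical function names and an arbitrary `x * 2 + 1` transform standing in for real work. The first version copies and then mutates; the second reads from an immutable source and writes straight to a destination the caller provides.

```rust
/// Copy first, then mutate in place. As written, this allocates a new Vec,
/// copies the input into it, and then walks the copy a second time.
fn copy_then_mutate(input: &[u32]) -> Vec<u32> {
    let mut out = input.to_vec();
    for x in out.iter_mut() {
        *x = *x * 2 + 1;
    }
    out
}

/// Source to destination. `input` stays immutable, each value is loaded,
/// transformed and stored in a single pass, and the caller owns `out`
/// and is free to reuse it across calls.
fn transform_into(input: &[u32], out: &mut [u32]) {
    assert_eq!(input.len(), out.len());
    for (dst, &src) in out.iter_mut().zip(input) {
        *dst = src * 2 + 1;
    }
}
```

Both produce the same values. Whether an optimizer closes the gap for a given case is something you'd have to measure, but the second form never needs the copy in the first place.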