| > Really? I mean like pretty much every instruction has separate source and destination addresses/registers, memory caches obviously complicate things, but it seems that processing data from one location to another is the most efficient operation. I can't think of any algorithm where copying is faster than mutating. * Chess Bitboards: Make / Unmake is a methodology for a reason. * Mergesort: In-place mergesort is far faster than copy-merge sort. Its much harder to write in-place mergesort, but it really makes a big difference. * Quicksort: In-place quicksort is faster than copy quicksort. Every "pure functional" recursive / copy quicksort has been... subpar... in my experience. * C++11 created a "destructive move", the entire R-value references system, because of the hugely more efficient concept of destructive / mutating moves. Destroying the old data while copying the data to a new location (aka: std::move) is far more efficient than making a safe copy (aka: copy-constructor). ------- Side note: The SIMD "Mergepath" Mergesort is absolutely brilliant. It uses absolutely no mutexes (and only uses thread-barriers) to guarantee that all writes are independent and correct. Furthermore, "Mergepath" mergesort parallelizes maximally (one-element per processor), while retaining O(n * log(n)) sort speeds. Because thread-barriers are no-ops on a SIMD computer (like AVX512), the thread-barriers in the MergePath concept are maximally efficient. :-) ------- What algorithms are you thinking of where copying is faster than mutating? At best, a copy ties mutation in speed. In a huge number of applications, mutation (aka: std::move) just faster. But std::move cannot be done across multiple threads: that's innately thread-unsafe and must be synchronized (at least for typical objects). --------------- Or maybe the above is too abstract? Lets go down to a very simple, but specific, algorithm. The Fisher-Yates (aka: Knuth) Shuffle. https://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle As far as I know, there is no copy-based algorithm that can beat the Knuth Shuffle's O(n) computation complexity. It is extremely difficult to write a O(n) speed shuffle without mutating the array in place. The fastest "copy" methodologies for the Shuffle problem I've seen involved generating a random number for each location, then copy-sorting the results: O (n * log(n)), which is asymptotically more complex than the Fisher Yates shuffle (and therefore: work-inefficient. If you parallelize to O(n*log(n)), there will be a size where the Fisher Yates shuffle is faster). |
Having said that, I hardly see a sort being the majority of time spent in real-world code.
> Because thread-barriers are no-ops on a SIMD computer
My understanding is that only holds true across the SIMD lane, and the algorithm doesn't hold up so generally to general purpose concurrency. But agreed, a sort is something I'd likely try to optimize on a single core, and express program concurrency outside that specific operation.
> What algorithms are you thinking of where copying is faster than mutating? At best, a copy ties mutation in speed.
My main point is it's actually the case that a copy ties mutation in speed quite often if you structure the flow of your data correctly. If you're operating on a value, writing it somewhere else is often close to free.
Having said that as an example, I've seen filtering a list run faster doing a move, highly dependent on the details of the cache of course.
> that can beat the Knuth Shuffle's O(n) computation complexity.
I haven't considered shuffles specifically. But getting into the rut of complexity evaluation alone dismisses important aspects of the machine and the data you are actually operating on, and can lead you very far from the fastest solution for the problem you are actually trying to solve.