|
|
|
|
|
by mononcqc
4084 days ago
|
|
This implementation does rely on a lot of memory copying going on there. Rather than copying the original element to the new process, then the new mapped element back to the original process (giving you the original list's worth of memory copied twice), you'll be copying each list element once to every worker you spawn, and then you'll copy the parts of the list worked on to the next worker, on and on, until the full list is finally copied back to the caller. So instead of copying 2n elements over the processes, you'll be copying roughly 2nlogn (or is it n(n+1)/2 ?) elements around. |
|
The solution to this problem is just to store messages as they are received inside of a map data structure, so that there is no overhead for receives. This requires indexing the list which makes the code a lot more inelegant.
Edit: Given n processors and an input of size n, I believe the time complexities are: Two map solution: O(n^2)+O(f) Fold solution: O(n^2)+O(f) Two map with map data structure: O(n)+O(f)
Where O(f) is the asymptotic upper bound of the function being mapped
Edit 2: This has me thinking about what will be the most efficient way to assign an element from the list to worker processes. In Erlang, it's clear that there isn't a way to avoid the O(n) overhead since the original process must reconstruct the list in order using cons.
In an imperative language this isn't necessary. Perhaps the original process can recursively assign indicies by spawning two children worker processes which are given a range of indicies to work on (who then create their own assignment processes). I believe the overhead is then only O(log n)...