Hacker News new | ask | show | jobs
by RootDynasty 4083 days ago
One thing to note is that the the parallel map consisting of two mapping operations has a different hidden overhead. If a message is received by a process that does not pattern match with any clauses in the receive block, the message is stored in a queue. When a new receive block is entered, all messages in the queue are pattern matched against the new receive block. In the worst case scenario, the worker process mapping the elements will finish in list order, so a great many pattern matches will be tried.

The solution to this problem is just to store messages as they are received inside of a map data structure, so that there is no overhead for receives. This requires indexing the list which makes the code a lot more inelegant.

Edit: Given n processors and an input of size n, I believe the time complexities are: Two map solution: O(n^2)+O(f) Fold solution: O(n^2)+O(f) Two map with map data structure: O(n)+O(f)

Where O(f) is the asymptotic upper bound of the function being mapped

Edit 2: This has me thinking about what will be the most efficient way to assign an element from the list to worker processes. In Erlang, it's clear that there isn't a way to avoid the O(n) overhead since the original process must reconstruct the list in order using cons.

In an imperative language this isn't necessary. Perhaps the original process can recursively assign indicies by spawning two children worker processes which are given a range of indicies to work on (who then create their own assignment processes). I believe the overhead is then only O(log n)...

1 comments

Just to comment, you're worrying about big O time complexities and that...really isn't what is going to dominate anything that is sufficiently complex to warrant a pmap rather than just a map. The constants on your computations will almost assuredly dwarf it.

The only O(n) operations are, yes, a completely degenerate case of when messages get sent back (which is -incredibly- unlikely; with Erlang's task scheduling you're likely only going to ever have a max message queue length of a few items, so it's more likely a constant factor. To get the degenerate case you would need it to finish in -reverse- list order, that is, would need to finish with the last one first, then the next to last one, etc), and when reversing the list(s) built up from the map at the end (as under the covers I'm pretty sure map is written to be tail recursive), which while technically O(n), is still incredibly fast.

I agree that worrying about the time complexity of the non-parallel portions of pmap is unlikely to be an issue for most use cases. It's still interesting to think about the tradeoffs though.

Hitting the degenerate case depends on the function you're computing in question. It's quite possible that the tasks will complete in the given order. I think you're giving too much credit to Erlang's task scheduler.

Also I'm not sure how one would even implement a tail recursive map function on a singly linked list. The cons operation can only add elements to the front of the list. I looked up how the map operation is implemented in Erlang. It isn't tail recursive:

https://github.com/erlang/otp/blob/172e812c491680fbb175f56f7...

I'm interested to know how you'd implement a tail-recursive version of map (continuations aren't allowed).

Interesting that it's not; I would have expected it to be. Given that, though, it's spawning tasks in list order, then receiving in list order; if the tasks complete in the order you spawned them, the first thing in the queue/arriving is always the item you're receiving on. That's the ideal case; more likely they'd be nearly in the order you spawned them, in which case you'd only have a few items to check through before you found the one you're receiving on.

I'm not saying the task scheduler is perfect, but I'd be really, really weirded out if it gave priority to the final process spawned, and worked its way backwards, which would be necessary for the degenerate case (that is, we spawned off items 1,2,3,4 in that order, but they completed 4,3,2,1. I would expect them to finish in close to 1,2,3,4 order, which would leave it at O(1) on each receive).

I'd implement a tail recursive map as -

  map(F, L) -> map(F, L, []).

  map(_, [], Acc) -> lists:reverse(Acc);
  map(F, [H|T], Acc) -> map(F, T, [F(H) | Acc]).