Hacker News new | ask | show | jobs
by chrisseaton 2971 days ago
It's not my definition! And you're right, normal message passing does leave you vulnerable to race conditions and your program can run a different way each time you run it! That's a major problem with it.

That's why I think other models of concurrency, such as the fork-join model, where the equivalent of 'messages' have to arrive in a deterministic order and so there are no race conditions, are safer.

1 comments

Message passing forces you to explicitly handle non-deterministic order, how can it leave you vulnerable to race conditions? If you need to receive a specific message first, you wait for that specific message, that's it. Simple and deterministic.
This is a real error I've seen someone make using Erlang in making a parallel sort when teaching parallel programming to masters students.

    actor A {
      receive half an array
      sort it
      send it to C
    }

    actor B {
      receive half an array
      sort it
      send it to C
    }

    actor C {
      (a, b) = divide input array into halves
      send a to A
      send b to B
      receive a'
      receive b'
      merge a', b'
    }
They send to A, send to B, and then think they're going to receive from A first, because they sent first, but B could finish first instead. Sometimes their program works, sometimes it doesn't.

Yeah it's their fault, but the model hasn't helped them not make the bug and worse they may never see the bug until they deploy into production.

If we used a fork-join model, they could not have made this mistake, and even if they did make some kind of mistake, at least they'd see it every time they ran their program.

    (a, b) = divide input array into halves
    fork {
      sort a
    }
    b' = sort b
    a' = join
    return a' + b'
As with most things in Erlang; if it's important, you must make it explicit. Implicit ordering works in your fork-join example with only a single fork, but if you require an ordering, you must be explicit about passing information through to enforce the ordering you need.

If you instead did

    fork {
      sort a
    }
    fork {
      sort b
    }
    a' = join
    b' = join
you would have the same problem as in Erlang. or you could have actor C sort B inside the actor between send a to A and receive a' and you would also have an implicit ordering.

In this case, merge sort could work with either order if a stable sort isn't required, or if the sort key is the whole element.

If it matters, this is easy to defend against, you just send a tag (a ref in Erlang would be perfect for this case, if the merge happened in a fourth actor, a numeric indication of ordering would be more useful) in the message to actors A and B, and use that to enforce an ordering when receiving the replies.

> you would have the same problem as in Erlang

Ah but that's not how fork-join works - you fork multiple jobs, and then you must join them all at the same time - you can't join just one.

You have to do something like

    (a, b) = join
If you have to join all the the jobs at the same time (which is pretty inflexible), how is the ordering of the results determined? My exposure to this model was in the perl threads module (and the forks module which offers the same api with os forking instead), where you join on a specific thread id, so you can easily enforce ordering by first joining a, and then joining b; I assumed a join with no parameters would join a thread that's ready and/or wait for the first to become ready, because that seems like the most useful/basic interface, and anything more specific (like join all the threads, or join the threads in some order) could be added as needed in the context where it's needed.
> If you have to join all the the jobs at the same time (which is pretty inflexible), how is the ordering of the results determined?

The model is that you can start a sequence of jobs to run in parallel, and then you have to wait for them all to finish. You get the results in the same order as the jobs you created. The order can't vary.

Think about a diamond shape - one job create two more jobs, and then they both send their results to the original job which cannot continue until all child jobs are finished.

> because that seems like the most useful/basic interface

Yes useful and basic, but the problem is it makes it easy to cause race conditions, which is where this thread started! You think you'll get some some thread being ready first so you write code assuming that without even thinking about it, and then once in a trillion you actually get the other result first. Yes, it's a programmer bug, but the point is because it's non-deterministic they may not notice until the one time it actually matters and someone dies.

Erlang lets you receive first message first even if it arrives last, you just have to specify which message, exactly as in your example. But order doesn't actually matter for sorting, you cannot possible make a mistake wrt to ordering here.