by Cthulhu_ 3021 days ago
How does it compare to Go channels? I only know actors from Scala, Go felt similar but more er, simple?
6 comments

Simple to a fault, in that channels and goroutines don’t prevent you from making mistakes the way actors do, i.e. you can still share state and memory. Go allows for the possibility of writing cleanly concurrent code, but you must study and practice.

Using an actor system, though, involves a bit more study with regard to setting things up, but it’s almost impossible to make those mistakes: actors simply can’t access each other’s state (maybe some systems allow it, but you’ll have to struggle against the language to make that mistake).

Erlang does a great job of protecting data from concurrency problems, but you're still vulnerable to mistakes like deadlock.
> Erlang does a great job of protecting data from concurrency problems, but you're still vulnerable to mistakes like deadlock.

Without any shared data, and with asynchronous message-passing concurrency (i.e. sharing memory by communicating, as opposed to communicating by sharing memory), can you please elucidate how we might get into deadlocks? Thanks!

OTP offers two basic messaging options: async or synchronous. The latter is implemented by forcing the caller to wait for the response.

If process A sends a synchronous message to B, and B while processing it sends a different synchronous message either directly to A or to another process that eventually calls back to A, you end up in a deadlocked state.
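For readers who think in Go rather than OTP, here is a rough sketch of that A -> B -> A cycle. Everything in it (`call`, `serveOnce`, `demo`, the 50 ms timeout) is a hypothetical stand-in and not an OTP API, and the timeout is only there so the sketch terminates instead of hanging. Each server handles a single request to keep it short.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

var errTimeout = errors.New("call timed out")

// answer is what a server sends back to a synchronous caller.
type answer struct {
	val string
	err error
}

// request carries the channel on which the caller waits for its answer.
type request struct {
	resp chan answer
}

// call is a synchronous send: deliver the request, then block until the
// answer arrives or the timeout expires.
func call(inbox chan<- request, timeout time.Duration) (string, error) {
	req := request{resp: make(chan answer, 1)}
	deadline := time.After(timeout)
	select {
	case inbox <- req:
	case <-deadline:
		return "", errTimeout // the server never picked the request up
	}
	select {
	case a := <-req.resp:
		return a.val, a.err
	case <-deadline:
		return "", errTimeout
	}
}

// serveOnce takes one request and answers it. While handle runs, this
// server is not reading its inbox, which is what makes the cycle fatal.
func serveOnce(inbox <-chan request, handle func() (string, error)) {
	req := <-inbox
	val, err := handle()
	req.resp <- answer{val, err}
}

// demo wires A to call B; when cyclic is true, B calls back into A.
func demo(cyclic bool) (string, error) {
	const timeout = 50 * time.Millisecond
	a := make(chan request)
	b := make(chan request)
	go serveOnce(a, func() (string, error) { return call(b, timeout) })
	go serveOnce(b, func() (string, error) {
		if cyclic {
			return call(a, timeout) // A is busy waiting on us
		}
		return "pong", nil
	})
	return call(a, 4*timeout)
}

func main() {
	fmt.Println(demo(false))
	fmt.Println(demo(true))
}
```

Without the timeout the cyclic case would simply hang, which is the deadlocked state described above.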

> OTP offers two basic messaging options: async or synchronous.

With synchronous messaging it is quite easy to see how deadlocks can happen, without trying too hard. For async messaging, which is what I was referring to, it is quite hard (and/or convoluted) to get into that state...

`receive` is a blocking action in Erlang, so something like:

  loop(State) ->
    receive
      msg -> loop(State)
    end.
Will block waiting for `msg` to show up. So if you have two processes that are interacting with each other and aren't careful in constructing their communication, you could end up with something like this (contrived):

  loop(Pid) ->
    receive
      msg -> Pid ! msg
    end,
    loop(Pid).
If you have two processes A and B that end up in this same loop but referencing each other, they'd deadlock. Substituting A and B for Pid you'd end up with something like:

  loop(B) ->    % Process A
    receive
      msg -> B ! msg
    end,
    loop(B).

  loop(A) ->    % Process B
    receive
      msg -> A ! msg
    end,
    loop(A).
Both waiting for `msg` but never sending it to their partner process.
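A loose Go rendering of the same situation, in case it helps. This is a sketch under assumptions: buffered channels stand in for Erlang mailboxes, and `relay`, `stuck`, `seen`, `done` and the 50 ms wait are all made-up names for illustration. Each goroutine waits for a message before forwarding it, so if nobody sends first, nothing ever moves.

```go
package main

import (
	"fmt"
	"time"
)

const wait = 50 * time.Millisecond

// relay mirrors the Erlang loop: block until a message arrives, then
// forward it to the partner. seen reports progress to the observer and
// done lets the observer shut the goroutine down afterwards.
func relay(inbox <-chan string, partner chan<- string, seen chan<- string, done <-chan struct{}) {
	for {
		select {
		case m := <-inbox:
			select {
			case seen <- m:
			case <-done:
				return
			}
			partner <- m // never blocks: one message, mailbox capacity 1
		case <-done:
			return
		}
	}
}

// stuck reports whether neither relay received anything within wait.
// With kick set, a third party drops one message into A's mailbox.
func stuck(kick bool) bool {
	a := make(chan string, 1) // A's mailbox
	b := make(chan string, 1) // B's mailbox
	seen := make(chan string)
	done := make(chan struct{})
	defer close(done)
	go relay(a, b, seen, done) // process A forwards to B
	go relay(b, a, seen, done) // process B forwards to A
	if kick {
		a <- "msg"
	}
	select {
	case <-seen:
		return false
	case <-time.After(wait):
		return true
	}
}

func main() {
	fmt.Println("nobody sends first, stuck:", stuck(false))
	fmt.Println("third party sends msg, stuck:", stuck(true))
}
```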
> Both waiting for `msg` but never sending it to their partner process.

But then you can 'deadlock' for a single pid as well, right? Where you wait for a message which no one is sending...

A single process could wait forever if no message ever arrives, yes. But it's not a deadlock since deadlock requires at least two processes that are waiting on each other in some fashion.

And responding to your lower-down comment: yes, if a third party could send `msg` to either of these, it'd break the deadlock. In my example (and the cases where I've caused this myself) there was no other process around to do that. But this is where, with experience, you learn to design things better and also use `after` clauses in the receive expression. These cause the processes to time out and do something, which may be to terminate and let a supervisor restart the whole thing, or some other behavior.
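Since the thread started with a Go comparison: the closest Go idiom to a `receive ... after` timeout that I know of is a `select` with `time.After`. A minimal sketch, where `receiveWithin`, `errNoMessage` and `patience` are hypothetical names and what to do on timeout (retry, exit, let something restart you) is left to the caller:

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

const patience = 50 * time.Millisecond

var errNoMessage = errors.New("no message before the timeout")

// receiveWithin waits for one message but gives up after d instead of
// blocking forever.
func receiveWithin(inbox <-chan string, d time.Duration) (string, error) {
	select {
	case m := <-inbox:
		return m, nil
	case <-time.After(d):
		return "", errNoMessage
	}
}

func main() {
	quiet := make(chan string) // nobody ever sends here
	_, err := receiveWithin(quiet, patience)
	fmt.Println("quiet mailbox:", err)

	busy := make(chan string, 1)
	busy <- "msg"
	m, err := receiveWithin(busy, patience)
	fmt.Println("busy mailbox:", m, err)
}
```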

That wouldn't count as a deadlock. A deadlock implies that it's possible to make a logical determination that the system is permanently stuck, and that state cannot possibly change without external intervention.

i.e. a deadlock can only be determined when you look at the system as a whole from the outside and determine that it's permanently stuck.

Also livelocks and race conditions. They're just harder, usually requiring some bad design decisions.
You can voluntarily write Go which doesn't share any state, although clearly if that's an invariant you'd like to keep, it would be really nice for the language to enforce it for you.

More importantly, I've found that as you grow such Go programs, write higher-order actors, and deal with all the error/cleanup cases, the selects start to get really brittle and you end up spending a lot of time refactoring them and carefully going over all the edge cases.

I don't know about D, but Go channels and Erlang processes are sort of complements or inverses of each other in some sense.

A Channel in Go is a first-class communications bus that can be passed around as a value, and senders and receivers are implicit/not first class. Arbitrary numbers of readers and writers can use one channel. Channels are also typed; only specific messages can pass across a given channel, though it can be specified by interface.
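To make the "typed, though it can be specified by interface" point concrete, here is a small sketch. The `Message` interface and the `Ping` and `Text` types are invented for illustration. The channel only accepts values that satisfy the interface, and the channel itself is handed to `collect` as an ordinary value.

```go
package main

import "fmt"

// Message is the only kind of value the channel below will carry.
type Message interface {
	Describe() string
}

type Ping struct{}

type Text struct{ Body string }

func (Ping) Describe() string   { return "ping" }
func (t Text) Describe() string { return "text: " + t.Body }

// collect receives the channel as a plain value; the <-chan type says
// this function may only read from it.
func collect(in <-chan Message) []string {
	var out []string
	for m := range in {
		out = append(out, m.Describe())
	}
	return out
}

// demo sends two different concrete types across one typed channel.
func demo() []string {
	bus := make(chan Message, 2)
	bus <- Ping{}
	bus <- Text{Body: "hi"}
	close(bus)
	return collect(bus)
}

func main() {
	fmt.Println(demo())
}
```

Trying to send a bare `int` or `string` on `bus` would be rejected at compile time, which is the contrast with an Erlang mailbox that accepts anything.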

In Erlang, you have to send a message to a specific process. Thus, the receivers (and symmetrically, the senders) are first-class objects that can be passed around, but the bus is implicit in the language. Processes may also receive any message, and should be able to deal with them. (One failure case that can occur in Erlang is a memory leak because some process is getting messages that it never receives, so they just build up in the mailbox. In practice this only happened to me maybe twice over the five years I was using Erlang, so it's not a stopper, just "something to be aware of", especially while debugging leaks.)

A positive for the Go model is that it is really easy to set up multiple readers for one writer, a common pattern, which Erlang handles somewhat gracelessly. (Yes, I am aware of the "pool" abstractions, all of which last I knew were one variation or another on "send a message to the pool coordinator to find out which process to send a message to", creating a single-process bottleneck on the pool.) There are some other nice ways to set up channel networks in Go to do some things Erlang would only be able to do with a lot more indirection and performance penalty on top of the fact that Erlang is already substantially (albeit not necessarily fatally) slower than Go. A negative for the Go model is that the way they've specified channels means that they rigidly must run in the same OS process; there is not and cannot be a "network channel" in Go with the same semantics as a Go channel, because a Go channel is an "exactly once" abstraction, which is impossible to run over a network [1]. Also, on the off chance you want to "guarantee" that a given recipient will process a message, it's on you to guarantee that the channel does not "get around" to goroutines you didn't expect.
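The one-writer, many-readers pattern looks roughly like this. It is a sketch, with `fanOut`, `work` and `handled` as made-up names: the readers all range over the same channel with no coordinator process in between, and the tally at the end checks that each job was taken by exactly one reader.

```go
package main

import (
	"fmt"
	"sync"
)

// fanOut has one writer push job numbers into a shared channel that
// several readers drain. It returns how many times each job was handled.
func fanOut(jobs, workers int) []int {
	work := make(chan int)
	handled := make(chan int, jobs) // large enough that readers never block

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := range work {
				handled <- j // whichever reader got the job reports it
			}
		}()
	}

	for j := 0; j < jobs; j++ {
		work <- j // the single writer
	}
	close(work)
	wg.Wait()
	close(handled)

	counts := make([]int, jobs)
	for j := range handled {
		counts[j]++
	}
	return counts
}

func main() {
	fmt.Println(fanOut(8, 3))
}
```

Which reader gets which job is up to the scheduler, which is the flip side mentioned above: the channel does not let you pick the recipient.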

A positive for the Erlang model is that you get that sweet, sweet network transparency that makes writing Erlang-based clustered servers sweeter than any other language I know, because they defined the characteristics of their bus from the very earliest days of the language for that use case, in contrast to Go which wrote their fundamental abstraction in a way that network transparency is impossible. (Bear in mind that systems ought to be designed for that early, it is not automatic, but it is still a staggering advantage for the language.) The downside is that when you want to do anything other than have one process send a message to a specified other process, you're going to have some sort of indirection or bad API or bottleneck process or something like that. Depending on the nature of your server this price may range from utterly irrelevant to quite expensive, although I'd expect it to be your "biggest problem" quite rarely.

(I'm also only comparing the channels vs. PID-based message passing. There are other relevant issues like the shared memory in Go vs. enforced isolation in Erlang, etc.)

[1]: And Go channels are "truly" exactly-once, too, so even Kafka's somewhat dodgy twisting of the term "exactly once" wouldn't be sufficient to implement them. Channels are used for memory synchronization, so it must be guaranteed that a non-buffered channel has had its message arrive on the other end because the fact the program counter of the receiver has advanced to that point in their code is something the language critically depends on, and a mere promise that it'll get there eventually, maybe twice, someday breaks that completely.
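A small sketch of what "used for memory synchronization" means in practice, relying on the ordering guarantees the Go memory model gives for channel operations. The function names are hypothetical. Neither variable is protected by a mutex or an atomic; the unbuffered channel alone orders the write before the read.

```go
package main

import "fmt"

// handOff: the write to shared happens before the send, and the send is
// ordered before the receive completes, so the read sees 42.
func handOff() int {
	var shared int
	ready := make(chan struct{}) // unbuffered
	go func() {
		shared = 42
		ready <- struct{}{}
	}()
	<-ready
	return shared
}

// rendezvous: on an unbuffered channel the send only completes once the
// receiver has reached its receive, so the sender can rely on whatever
// the receiver did before that point.
func rendezvous() string {
	var note string
	ch := make(chan int) // unbuffered
	go func() {
		note = "receiver was here"
		<-ch
	}()
	ch <- 1
	return note
}

func main() {
	fmt.Println(handOff())
	fmt.Println(rendezvous())
}
```

An "eventually, maybe twice" delivery over a network could not give the sender in `rendezvous` that guarantee, which is the point of the footnote.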

It's going to be the difference between communicating sequential processes (CSP), which Go tries to follow, and the actor model.

https://cstheory.stackexchange.com/questions/184/whats-the-d...

https://en.wikipedia.org/wiki/Actor_model_and_process_calcul...

Exactly! What might not be immediately obvious is that CSP works OK for concurrency, but not for distributed systems. For more details, see 2.1.3 (page 28) of https://eprints.illc.uva.nl/943/1/MoL-2015-02.text.pdf
The Go model is also faster but less safe.
Golang is a great middle ground. It defaults to safe (but might be slow if you do something the wrong way), and it can be fast if you avoid costly mistakes (the syntax tends to help you remember what you're actually working with as a data structure, so costs are mostly upfront). If you /need/ to get fancy and know exactly what you're doing, you can use the unsafe package to work with pointers or to interface with other libraries. If you're not sure whether those libraries are safe to execute concurrently (most state-based systems aren't), there's runtime.LockOSThread() to wire the calling goroutine to its current OS thread.
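One way this is sometimes arranged, as a sketch only: a single goroutine locks itself to an OS thread and is the only code that ever touches the library, with everyone else going through a channel. `legacyCounter` is a made-up stand-in for a stateful, non-thread-safe library, and `owner` and `bumpN` are hypothetical names. Note that `runtime.LockOSThread()` by itself does not make anything concurrency-safe; here the serialization through one goroutine does that, and the lock only guarantees the calls all come from the same thread.

```go
package main

import (
	"fmt"
	"runtime"
)

// legacyCounter stands in for a library with internal state that must
// not be called from several threads at once.
type legacyCounter struct{ n int }

func (c *legacyCounter) bump() int {
	c.n++
	return c.n
}

// owner pins itself to one OS thread and serves every call to the
// library. Callers pass in the channel they want the result on.
func owner(calls <-chan chan<- int) {
	runtime.LockOSThread()
	defer runtime.UnlockOSThread()
	lib := &legacyCounter{}
	for reply := range calls {
		reply <- lib.bump()
	}
}

// bumpN makes n calls through the owner and returns the last result.
func bumpN(n int) int {
	calls := make(chan chan<- int)
	go owner(calls)
	reply := make(chan int)
	last := 0
	for i := 0; i < n; i++ {
		calls <- reply
		last = <-reply
	}
	close(calls) // lets the owner goroutine finish
	return last
}

func main() {
	fmt.Println(bumpN(5))
}
```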
It's conceptually much simpler. Erlang(/Elixir) does not have separate concepts of routines and channels, you just send messages directly to a process (and it does whatever it wants with that), and not only does it embrace "don't communicate by sharing memory, share memory by communicating" it enforces it: Erlang's data types are mostly immutable and each process has its own private heap.

The language does look and feel somewhat odd though. It was inspired by Prolog (so it will look very odd if you've not used Prolog) but doesn't do full unification (so it will feel very odd if you have used Prolog).

That’s because Go is simpler :)