by sasa555 3429 days ago
It is possible to start external processes from BEAM and interact with them. I've blogged a bit about it at http://theerlangelist.com/article/outside_elixir

You can also write NIFs (native implemented functions), which run in the BEAM process (see http://andrealeopardi.com/posts/using-c-from-elixir-with-nif...). The latter option should be a last resort though, because it can violate BEAM's safety guarantees, in particular fault tolerance and fair scheduling.

So using a BEAM-facing language as a "controller plane" while resorting to other languages in special cases is definitely a viable option.


I spent 30 minutes looking at NIFs, but I was scared away. My understanding is that if a NIF crashes, then BEAM crashes. Which leads me to think that if you need a NIF, then you need safety guarantees on the native side that C can't provide.
Think of NIFs as Erlang's equivalent to Rust's unsafe{} blocks. It's where you write the implementations of library functions that make system calls, and the like. But, like unsafe{} blocks, you do as little as possible within them.

For example, if you want to call some C API from Erlang where the C API takes a struct and returns a struct, you'll want to actually populate the request struct--and parse the return struct--on the Erlang side, using binary pattern matching. The C code should just take the buffer from enif_inspect_binary(), cast it into the req struct, make the call, cast the result back to a buffer and pass it to enif_make_binary(), and then return that binary. No C "logic" that could be potentially screwed up. Just glue to let Erlang talk to a function it couldn't otherwise talk to. Erlang is the one doing the talking.
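
A minimal sketch of that "build and parse the buffer on the high-level side" idea, written in Python with the standard struct module as a stand-in for Erlang binary pattern matching. The request and reply layouts here are invented for illustration (packed, little-endian); they don't come from any real C API.

```python
import struct

# Hypothetical layouts, assumed for illustration only (packed, little-endian):
#   struct req   { uint32_t id;     uint16_t flags;  }
#   struct reply { int32_t  status; uint32_t length; }
REQ_FORMAT = "<IH"
REPLY_FORMAT = "<iI"


def build_request(req_id, flags):
    """Lay out the request bytes exactly as the native side would expect them."""
    return struct.pack(REQ_FORMAT, req_id, flags)


def parse_reply(buf):
    """Pull the fields back out of the raw reply buffer."""
    status, length = struct.unpack(REPLY_FORMAT, buf)
    return {"status": status, "length": length}
```

With the layout handled on this side, the native glue has nothing to do except cast the buffer and make the call. A real C struct may carry padding, so the format string would have to match the actual ABI.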

On the other hand, if you have a big, fat library of C code, and you want to expose it all to Erlang? Yeah, that's not what NIFs are for. (Port drivers can do that, but you're about the right amount of terrified of them here: they're for special occasions, like OpenSSL.)

The "right" approach with some random untrusted third-party lib is to 1. write a small C driver program for that library, and then 2. use Erlang to talk to it over some IPC mechanism (most easily its stdio, for which Erlang supports a particular protocol).
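
As a rough sketch of the external side of such a setup, here is a driver loop in Python rather than C, assuming the port is opened on the Erlang side with the {packet, 4} option, so every message is preceded by a 4-byte big-endian length. The names read_packet, write_packet, serve, and handler are made up for this example.

```python
import io
import struct


def read_packet(stream):
    """Read one length-prefixed message; return None on EOF (port closed)."""
    header = stream.read(4)
    if len(header) < 4:
        return None
    (length,) = struct.unpack(">I", header)  # 4-byte big-endian length
    payload = stream.read(length)
    if len(payload) < length:
        return None
    return payload


def write_packet(stream, payload):
    """Write one message with the same 4-byte length prefix."""
    stream.write(struct.pack(">I", len(payload)) + payload)
    stream.flush()


def serve(inp, out, handler):
    """Answer requests until the other side closes the pipe."""
    while True:
        request = read_packet(inp)
        if request is None:
            break
        write_packet(out, handler(request))
```

In a real driver you would call something like serve(sys.stdin.buffer, sys.stdout.buffer, handler), where handler wraps the third-party library. If the driver crashes, only the external OS process dies and the BEAM side sees the port close.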

If you need more speed, you can still keep the process external: in the C process, create a SHM handle, and pass it to Erlang over your IPC mechanism. Write a NIF whose job is just to read from/write to that handle. Now do your blits using that NIF API. If the lib crashes, the SHM handle goes away, so handle that in a check in the NIF. Other than that, you're "safe."

Precisely, which is why I always advise to consider ports first :-)

However, in some situations the overhead of communicating with a port might be too large, so then you have two options:

  1. Move more code to another language which you run as a port.
  2. Use a NIF.

It's hard to generalize, but I'd likely consider option 1 first.

If you go for a NIF, you can try to keep its code as simple as possible which should reduce the chances of crashing. You can also consider extracting out the minimum BEAM part which uses the NIF into a separate BEAM node which runs on the same machine. That will reduce the failure surface if the NIF crashes.

I've also seen people implementing NIFs in Rust for better safety, so that's another option to consider.

So there are a lot of options, but as I said, NIF would usually be my last choice precisely for the reason you mention :-)

Aren't dirty NIFs on the horizon as well, which would help with the scheduling issues currently associated with NIFs?
Dirty schedulers can help with long running NIFs, but they can't help with e.g. a segfault in a NIF taking down the entire system.
Apparently people are working on this, using Rust for writing NIFs: https://github.com/hansihe/rustler
Love your blog and book, Sasa. Could you elaborate on the fair scheduling disruption by NIFs? I don't recall ever reading about that.
Thanks, nice to hear that!

Basically a NIF blocks the scheduler, so if you run a tight loop for a long time, there will be no preemption. Therefore, invoking foo(), where foo is a NIF which runs for say 10 seconds, means a single process will get 10 seconds of uninterrupted scheduler time, which is way more than other processes not calling that NIF.

There are ways of addressing that (called dirty schedulers), but the thing is that you need to be aware of the issue in the first place.

If due to some bug a NIF implementation ends up in an infinite loop, then the scheduler will be blocked forever, and the only way to fix it is to restart the whole system. That is btw. a property of all cooperative schedulers, so it can happen in Go as well.
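
To make the cooperative-scheduling point concrete, here is a toy round-robin scheduler built on Python generators. It is only an illustration, not how BEAM schedules processes: each yield stands for a point where a task voluntarily gives up control, and the names run, polite, and greedy are invented for this sketch.

```python
from collections import deque


def polite(slices):
    """Does one unit of work per turn, then hands control back."""
    for _ in range(slices):
        yield 1


def greedy(units):
    """Does all of its work in one turn; stands in for a long-running NIF."""
    done = 0
    for _ in range(units):
        done += 1  # no yield inside the loop, so nobody else can run
    yield done


def run(tasks):
    """Round-robin over the tasks, recording (name, units of work) per turn."""
    queue = deque(tasks.items())
    trace = []
    while queue:
        name, task = queue.popleft()
        try:
            units = next(task)  # the scheduler regains control only at a yield
        except StopIteration:
            continue  # task finished, drop it
        trace.append((name, units))
        queue.append((name, task))
    return trace
```

Running run({"a": polite(2), "nif": greedy(1000), "b": polite(2)}) gives "nif" a single 1000-unit turn while "a" and "b" wait. If greedy looped forever, run would never get control back, which is the infinite-loop case described above.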

In contrast, if you're not using NIFs, I can't think of any Erlang/Elixir program that will block the scheduler forever, and assuming I'm right, that problem is completely off the table.

As linked elsewhere here, tight loops that never preempt are being fixed in Go 1.8/1.9[0]. Looks like a flag may have been added to Go 1.8 called "GOEXPERIMENT=preemptibleloops" that adds a preemption point at the end of a loop. It's behind a flag for performance/testing reasons, but they are working on it.

[0] https://github.com/golang/go/issues/10958

Won't pre-emptible loops lead to more irreproducible race conditions as a negative consequence, unless the preemption is done deterministically?
Are you asking about BEAM or Go? Preemption already works in BEAM and doesn't lead to race conditions because of shared-nothing concurrency.
I was asking about Go. I understand BEAM's advantages in that area