Hacker News
by kitd 3785 days ago

    in case of high level platforms, languages or libraries     
    it can be claimed that Erlang virtual machine is almost 
    unique because JVM threads depend on operating system 
    schedulers, CAF which is a C++ actor library uses 
    cooperative scheduling, Golang is not fully preemptive 
    and it also applies to Python’s Twisted, Ruby’s Event 
    Machine and Nodejs.
Most languages/runtimes that take concurrency seriously do so using libraries or frameworks (e.g. Akka on the JVM) that look a lot like Erlang's model under the hood. Erlang is not quite as unique as it was, but obviously baking it into the language is very useful.

Akka is not preemptive. That's the problem with library solutions: they're still a squarish peg in a round hole.
As far as I understand, you can make an actor reentrant which frees the thread. It's definitely a leaky concern compared to Erlang's process implementation.
My understanding is that that isn't really preemption; rather, it's cooperative scheduling. As soon as the scheduler hands execution to the actor, everything else in the system has to hope there isn't a bug in the actor that permanently ties up that thread.
Good point. I guess I find it like a hybrid where you don't benefit from the cooperative side and don't achieve true preemption. I think either end would be better than keeping the middle ground.
To be fair, the original article is fundamentally incorrect: Erlang is not truly preemptive either. It uses cooperative multitasking based on reduction counting, which yields at function calls, and a badly programmed NIF can still ruin your day if you don't put it on a dirty scheduler.
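
For anyone who wants to see the idea in miniature, here is a toy sketch of reduction counting. It is a simulation in Python, not how the BEAM is implemented: `worker`, `run` and `REDUCTION_BUDGET` are made-up names, each `yield` stands in for one function call (one reduction), and the budget of 4 is arbitrary. The scheduler switches processes only when the count runs out, so anything that runs for a long time between two yields (the NIF case) cannot be interrupted.

```python
from collections import deque

REDUCTION_BUDGET = 4  # hypothetical value, chosen only to keep the trace short


def worker(name, calls, log):
    """A toy 'process': every yield marks one function call (one reduction)."""
    for i in range(calls):
        log.append((name, i))
        yield


def run(processes, budget=REDUCTION_BUDGET):
    """Round-robin over processes, switching after `budget` reductions."""
    queue = deque(processes)
    switches = 0
    while queue:
        proc = queue.popleft()
        for _ in range(budget):
            try:
                next(proc)  # run the process up to its next call boundary
            except StopIteration:
                break  # process finished, do not re-queue it
        else:
            # Budget used up: move the process to the back of the run queue.
            queue.append(proc)
            switches += 1
    return switches


log = []
switches = run([worker("a", 10, log), worker("b", 3, log)])
```

After this runs, `log` shows "a" getting four steps, then "b" running to completion, then "a" resuming. The worker code never asks to be descheduled; the counting happens in the scheduler loop.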
You could say that no scheduler is truly pre-emptive, because it can't interrupt a process in the middle of a machine instruction.

Implementing NIFs should be done with the same care you would use adding a new machine instruction to your processor ;-)

There are N ways that badly coded C code loaded into a Unix process can send things to hell in a handbasket. "All bets are off" as they say.
In this case just taking more than a millisecond can cause scheduler collapse. So it's a pretty easy mistake to make.
Although writing C code for NIFs is not a regular task for Erlang developers, it must be done with extreme care: not only can a long-running NIF degrade the responsiveness of the VM, but if it crashes, the whole VM crashes with it.

However, when there are no other options except writing a NIF, there are ways to protect yourself:

1. Your NIF should return in less than a millisecond.

2. If item 1 is not possible, split the work into shorter NIF calls.

3. If items 1 and 2 are not possible, you have a dirty NIF: a NIF that cannot be split and cannot finish in a millisecond or less. The Erlang virtual machine has an experimental feature called "dirty schedulers". When it is enabled, a separate set of schedulers stands ready to execute dirty NIFs, so they won't interfere with the operation of the normal schedulers.

4. If items 1 and 2 are not possible and you don't want to use dirty schedulers, the +sfwi emulator flag is available to force normal schedulers to wake up again after a collapse.
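
Item 2 (splitting the work into shorter calls) can be sketched roughly as follows. This is only a model of the pattern in Python, not NIF code: `sum_squares_chunk`, `sum_squares` and the `budget_s` parameter are hypothetical names, and a real NIF would do this in C through the erl_nif API. Each call does as much work as fits in about a millisecond, then returns its position and partial result so the caller can re-enter later.

```python
import time


def sum_squares_chunk(data, start, acc, budget_s=0.001):
    """Process items from `start` until done or the time budget is spent.

    Returns ("done", total) or ("continue", next_index, partial_total).
    """
    deadline = time.monotonic() + budget_s
    i = start
    while i < len(data):
        acc += data[i] * data[i]
        i += 1
        # Check the clock only every 64 items to keep the overhead low.
        if i % 64 == 0 and time.monotonic() >= deadline:
            return ("continue", i, acc)
    return ("done", acc)


def sum_squares(data):
    """Caller side: re-enter until finished, counting how many calls it took."""
    state = ("continue", 0, 0)
    calls = 0
    while state[0] == "continue":
        # Between two calls, a scheduler would be free to run other work.
        state = sum_squares_chunk(data, state[1], state[2])
        calls += 1
    return state[1], calls


total, calls = sum_squares(list(range(1000)))
```

The important design point is that all state needed to resume (`i` and `acc` here) is handed back to the caller instead of living on the native stack.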

These items are ways to stay in a normal scheduling state even when writing native functions in C (NIFs). What the article describes, though, is plain Erlang code, which is run by the schedulers and preempted with no trouble as soon as it reaches the reduction limit.

This. Behavior is much better these days but dirty schedulers and ports are still the first places to consider putting C code. Only when you know you've got a solid implementation should you upgrade it to a NIF.

A short note on how hard this is: until recently, in 18.0, there were many BIFs (built-in functions that are part of the VM) that could cause the same scheduler collapses. If the VM developers don't always get it right, the chances that some arbitrary C code will are very small. Tools like QuickCheck can help in testing the inputs and outputs, but it's hard to set up complex VM stress states and thus very hard to make guarantees about NIFs.

While there is definitely overhead, I'd say regular external "ports" (OS-level subprocessing) are quite underrated from what I see in more recent Erlang code. There's a lot that can be done this way if port communication is carefully designed.
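
As a rough illustration of what the external side of a port can look like, here is a sketch of a port program's message loop in Python. It assumes the port was opened with the `{packet, 4}` option, under which every message is prefixed with a 4-byte big-endian length; the helper names (`encode_packet`, `read_packet`, `serve`) and the upper-casing handler are made up for the example.

```python
import io
import struct


def encode_packet(payload: bytes) -> bytes:
    """Prefix the payload with its length as a 4-byte big-endian integer."""
    return struct.pack(">I", len(payload)) + payload


def read_packet(stream):
    """Read one length-prefixed message; return None on EOF or truncation."""
    header = stream.read(4)
    if len(header) < 4:
        return None
    (length,) = struct.unpack(">I", header)
    payload = stream.read(length)
    if len(payload) < length:
        return None
    return payload


def serve(inp, out, handler):
    """Answer each request with handler(request) until the input closes."""
    while True:
        request = read_packet(inp)
        if request is None:
            break
        out.write(encode_packet(handler(request)))
        out.flush()


# In-memory demo; a real port program would pass sys.stdin.buffer and
# sys.stdout.buffer, since the VM talks to it over stdin/stdout by default.
demo_in = io.BytesIO(encode_packet(b"ping") + encode_packet(b"abc"))
demo_out = io.BytesIO()
serve(demo_in, demo_out, bytes.upper)
```

Because this runs in its own OS process, a crash or a long computation here costs a port, not the VM, which is the trade-off against the extra serialization overhead.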

I like to say that in NIFs (and port drivers) all bets are off anyway, most notably on process isolation and fault-tolerance. Guarantees such as "preemption" and fault-tolerance are provided by the layer above (i.e. pure Erlang), and can only be satisfied if the underlying layer behaves properly.