by eulerfx234 3663 days ago
I'm an engineer at Jet and hopefully I'll be able to answer your question. The concurrency model within F# is based on continuations. The type is Async<'a> = (('a -> unit) -> unit): it's a function which accepts a callback to be notified when the async operation completes. The existing .NET ThreadPool is used to schedule these continuations across OS threads. Async uses a trampoline to tame the stack. The ThreadPool itself is a fairly sophisticated piece of work, with scaling heuristics, work-stealing queues, etc. The ThreadPool interacts with the Windows IO completion port multiplexer for IO. We extend this primitive in a variety of ways, notably into an AsyncSeq<'a>, which in Haskell terms is ListT Async: a linked list interleaved with Async. We use this for stream processing, sockets, fault tolerance, etc. Async is very similar to Haskell's IO monad, although the representations are a bit different.
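To make the continuation encoding concrete, here is a minimal sketch in Python rather than F#. It only mirrors the simplified shape ('a -> unit) -> unit described above; as far as I know the real F# Async also carries exception and cancellation continuations, and it trampolines, which this sketch does not. The names unit, bind and run_sync are illustrative assumptions, not part of any library.

```python
from typing import Callable, TypeVar

A = TypeVar("A")

# An "async" here is just a function that takes a callback and eventually calls it.
Async = Callable[[Callable[[A], None]], None]


def unit(value):
    """Lift a plain value into a computation that completes immediately."""
    def run(callback):
        callback(value)
    return run


def bind(computation, f):
    """Run `computation`, pass its result to `f`, then run what `f` returns."""
    def run(callback):
        computation(lambda a: f(a)(callback))
    return run


def run_sync(computation):
    """Helper for the demo: only valid when the callback fires synchronously."""
    box = []
    computation(box.append)
    return box[0]


# (20 + 1) * 2, expressed as a chain of continuations.
pipeline = bind(unit(20), lambda x: bind(unit(x + 1), lambda y: unit(y * 2)))
result = run_sync(pipeline)
```

Every bind here nests another call frame, so a long chain would eventually overflow the stack. That is the problem the trampoline mentioned above is there to solve.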

However, both Async and IO are insufficient to represent disjunctions. To that end, another concurrency library that we use is Hopac. Hopac is an F# implementation of CML, with some differences. Hopac provides a notion of an alternative (called event in CML; think Haskell's Alternative typeclass if you relax the laws a bit) and synchronous channels (note that async channels are special cases of sync channels). Hopac has experimental support for lawful MonadPlus as well (see transactional events in Haskell = IO + STM + CML).

Some things that we're heading towards next include generalizing STM to be a bit more like RCU (see relativistic Haskell). Additionally, we are experimenting with extending this to session types, but nothing is in production yet.

F# also provides a MailboxProcessor, which is similar to an Erlang actor, but without explicit distribution support, so perhaps it is more of an "agent". We typically use this as a low-level concurrency primitive, rather than a full-blown programming model. Most of our services are compositions of various request/reply interactions, and the Async model above is a great fit for this. In fact, we have primitives centered around the notion of an arrow 'a -> Async<'b> (specialized to Async). These primitives provide support for fault tolerance, logging, tracing, etc.
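As a rough illustration of the arrow idea, here is a Python sketch that treats a service as a function from a request to a callback-style async reply, composes two such services, and wraps each with logging. This is not our actual code: compose, with_logging, lookup_price and apply_discount are all hypothetical names made up for the example.

```python
def unit(value):
    """An async computation (callback style) that completes immediately."""
    return lambda callback: callback(value)


def compose(f, g):
    """Kleisli composition: (a -> Async b) then (b -> Async c) gives a -> Async c."""
    def arrow(a):
        def run(callback):
            f(a)(lambda b: g(b)(callback))
        return run
    return arrow


def with_logging(name, arrow, log):
    """Wrap an arrow so each request and reply is appended to `log`."""
    def wrapped(request):
        def run(callback):
            log.append((name, "request", request))

            def on_reply(reply):
                log.append((name, "reply", reply))
                callback(reply)

            arrow(request)(on_reply)
        return run
    return wrapped


# Hypothetical services, each of shape request -> Async reply.
def lookup_price(sku):
    return unit({"sku": sku, "price": 10})


def apply_discount(item):
    return unit({**item, "price": item["price"] - 2})


log = []
service = compose(
    with_logging("lookup", lookup_price, log),
    with_logging("discount", apply_discount, log),
)

replies = []
service("abc")(replies.append)
```

The point is that cross-cutting concerns like logging attach to the arrow as a whole, so the individual services stay unaware of them.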

All of this works well with the GC. Async does cause allocations of course, but this is a price we're more than willing to pay. We've shared some GC dumps with the designer of the .NET GC and she believes they are sensible for a functional language. Hopac took optimization to a greater extreme, reducing allocations where possible.

Another F# library of interest is MBrace. This takes the notion of Async and fits it with a scheduler that schedules across a cluster rather than an individual instance.

Hopefully this helps!

3 comments

Also, to compare with other languages:

Async is similar to a Future (such as in Java), with the difference that Future produces a result once and caches it, whereas Async re-evaluates each time it is executed. It can be made to cache of course. A Future is more like a TPL Task, though IMO, Async provides a more predictable programming model.
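The re-evaluation difference can be sketched in a few lines of Python, using the same callback-style encoding of Async. Here fetch and cache are hypothetical names, and this cache only works when the wrapped computation calls back synchronously; a real one would also need to queue callbacks that arrive while the first run is still in flight.

```python
counter = {"runs": 0}


def fetch(callback):
    """An Async-style computation: it does its work every time it is run."""
    counter["runs"] += 1
    callback(counter["runs"])


def cache(computation):
    """Run `computation` at most once and replay the stored result afterwards."""
    stored = []

    def run(callback):
        if not stored:
            computation(stored.append)
        callback(stored[0])

    return run


seen = []

# Running the plain computation twice does the work twice.
fetch(seen.append)
fetch(seen.append)

# The cached version behaves like a Future: one evaluation, many reads.
cached = cache(fetch)
cached(seen.append)
cached(seen.append)
```

After this runs, the plain computation has produced two different results while the cached one has produced the same result twice.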

Go has goroutines and channels. A goroutine is similar to Async. In essence, both are a notion of lightweight thread. Note however that Hopac's support for channels is far richer than that of Go.

> async channels are special cases of sync channels

You got it the other way around: synchronous is a special case of asynchronous, because any synchronous result or stream can be processed asynchronously, but to guarantee synchronous results you have to add restrictions. And going the other way, from async to sync, is not possible without blocking threads, which is an error-prone, platform-specific hack. Take the possibility of blocking threads away and you'll notice the true nature of these models.

In CML/Hopac, async channels (buffers) are implemented in terms of sync channels; there is no async channel primitive built in. Synchronization is the essence of this model. When an operation is waiting on a matching communication through a channel, it is suspended, but no OS thread is blocked.

But yes, going from async to sync requires blocking, which is why CML/Hopac takes the approach of making sync the core primitive.

Thank you, that satisfies my curiosity.