by MathMonkeyMan 1066 days ago
Multitasking systems gave us processes.

But those were too much.

So we got threads, which are processes that share an address space, file table, and some other things. The scheduler can switch from one to the other more easily than between processes, and data can be shared between threads without needing serialization.
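
To make the "no serialization" point concrete, here is a minimal sketch in Python. The names `counts`, `lock`, and `tally` are made up for illustration. Every thread mutates the same in-memory dictionary directly, and the lock is only there to keep the read-modify-write step safe.

```python
import threading

counts = {}              # ordinary object, visible to every thread
lock = threading.Lock()  # guards the read-modify-write on counts

def tally(words):
    """Count words into the shared dict; nothing is copied or pickled."""
    for w in words:
        with lock:
            counts[w] = counts.get(w, 0) + 1

chunks = [["a", "b"], ["b", "c"], ["a", "a"]]
threads = [threading.Thread(target=tally, args=(c,)) for c in chunks]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counts)
```

With separate processes, the same tally would have to be sent back over a pipe, a socket, or explicitly set up shared memory.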

But those were too much.

So we got user space threads, which are logical threads of execution that are driven by a runtime entirely in user space. The runtime adds scheduling hooks into all I/O functions in the standard library, or even uses a system API like Unix signals to preempt logical threads. No system-level context switching is needed. User space threads can be tiny.

But those were too much.

So we got coroutines, which allow a programmer to define logical "threads" of execution that cooperatively interact with each other. There is no assumption about the presence of a scheduler. The programmer either writes their own event loop or invokes one from a library in a "real" logical thread.
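
A rough sketch of "the programmer writes their own event loop", using plain Python generators as coroutines. `worker` and `run_round_robin` are hypothetical names, and round-robin is just one possible policy. Each coroutine gives up control only at its own `yield`.

```python
from collections import deque

def worker(name, steps, log):
    """A coroutine: do one step, then voluntarily hand control back."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # cooperative suspension point

def run_round_robin(coroutines):
    """A tiny hand-written event loop: resume each coroutine in turn."""
    ready = deque(coroutines)
    while ready:
        task = ready.popleft()
        try:
            next(task)          # run until the next yield
        except StopIteration:
            continue            # coroutine finished; drop it
        ready.append(task)      # still alive; queue it again

log = []
run_round_robin([worker("a", 2, log), worker("b", 3, log)])
print(log)  # ['a:0', 'b:0', 'a:1', 'b:1', 'b:2']
```

Nothing here preempts anything: a coroutine that never yields would starve the others, which is exactly the trade-off against user space threads.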

I wonder what comes next. As far as [communicating sequential processes][1] are concerned, maybe cooperative coroutines are as low as you can go.

[1]: https://www.cs.cmu.edu/~crary/819-f09/Hoare78.pdf

4 comments

Even coroutines can't save us programmers from the dread of having to explain every little possible detail to the computer, in a way that doesn't take forever to execute if you provide input that is even slightly different from what the programmer had in mind.

I'm a bit surprised by the lack of languages that parallelize code automatically as much as possible, but only as far as measured performance suggests. It requires semantics that support concurrency (for example, iteration over sequences needs to be in unspecified order by default) and it also rests on whole-program flow analysis, but I see no obstacles in principle.
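
For a feel of what "unspecified order by default" could look like, here is a small sketch using Python's standard `concurrent.futures`. This is explicit, opt-in parallelism, not the automatic kind described above, and `unordered_map` and `square` are made-up names. The caller gets results in completion order and must not rely on input order.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def square(x):
    return x * x

def unordered_map(fn, items, max_workers=4):
    """Apply fn to every item; results come back in completion order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(fn, item) for item in items]
        return [f.result() for f in as_completed(futures)]

results = unordered_map(square, range(8))
print(sorted(results))  # sort only because the order is unspecified
```

In standard CPython the global interpreter lock means threads will not speed up CPU-bound work like this; the sketch is about the ordering contract, not about performance.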

Didn't Microsoft do some research on such a language years ago? There is also ParaSail, made by someone from the Ada community. What happened to these projects? Does nobody use them?

Languages built on the BEAM VM like Erlang and Elixir support concurrency at the runtime level, though it's up to you to specify when you want to run something in parallel. Or am I missing why they don't fulfill your requirements?

I'm not sure if it's desirable to parallelize code automatically, as in many cases you do need at least _some_ parts of the code to run synchronously. But it's an interesting thought experiment to have parallelism be the default instead of opt-in.

Guy Steele Jr. worked on a language called "Fortress." He has given a few good talks:

- (toy problem) https://www.youtube.com/watch?v=ftcIcn8AmSY

- (parallelism) https://www.youtube.com/watch?v=dPK6t7echuA

- (languages) https://www.youtube.com/watch?v=dCuZkaaou0Q

Turns out that autopar is an extremely hard problem beyond simple use cases. And for simple use cases, existing solutions work fine (`#pragma omp for` goes a long way, for example).

I don’t think there’s any further down to go, but there’s probably room to go up to distributed computing. IIRC, some early versions of Go, back when it was in alpha, had channels that worked across machines.

I think there is. How do you describe data parallelism? Coroutines are too big when all your function calls are doing the same thing, just on different data (think ISPC, shaders, CUDA). There is still one more step in parallelism where coroutines cost too much :-)

Good point. SIMD routines!

I imagine the next step could be something like: "process this collection of task-items as you wish". If you squint, any concurrent execution can be viewed as a sequence of tasks (which can also be collections), even if there are only one or two of them.

Something along the lines of what CUDA, OpenCL, or ISPC do, but integrated into the language.