by Scorponok 5169 days ago
I've been curious about this for a while: If you can do concurrent processing so easily by message passing, why not just write Java or C++ that just passes messages around instead of sharing state? That way you don't have to learn a whole new programming language to do it.
12 comments

You can do that, but as far as I know Erlang (not far, mind you), it takes care of a ton of details for you. For instance, to send a message to another actor, whether it is on your machine or a remote machine, you just need a PID (process identifier, an Erlang ID concept similar to, but not the same as, an OS-level PID). In C++ you would either need to deal with local/remote differences in message transport (ignoring the non-trivial nature of serializing arbitrary types in C++) or create an abstraction that would do so on your behalf. Second, in Erlang all actors have their own heaps, so their local data is spatially close in memory by definition, making actors NUMA/cache friendly (assuming the runtime is "doing it right"). Heap-allocated data passed between actors in C++ wouldn't have that attribute by default (again, you could make it so with some effort, but it certainly isn't free). Third, actors in Erlang don't use full OS-level threads, so the overhead of spawning, say, 10k of them is not the overhead of spawning 10k C++ threads (also ignoring that C++ really only acknowledged the existence of threads in C++11). These are a few issues/benefits, but as I said, you could do mostly all of this in any language; in Erlang a lot of the gross/tricky details are already handled for you.
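To make the comparison concrete, here is a minimal sketch of hand-rolled message passing, written in Python purely for brevity. The `Actor` class and the `("add", n)` / `("get", reply)` / `("stop", None)` message shapes are made up for illustration and are not taken from any library. Note what it does not give you: each actor costs a full OS thread, there is no remote transport, and nothing stops other code from reaching into `_total` directly.

```python
import queue
import threading


class Actor:
    """Hypothetical minimal actor: one OS thread, one mailbox, private state."""

    def __init__(self):
        self._mailbox = queue.Queue()   # messages are the only intended way in
        self._total = 0                 # private state, touched only by _run
        self._thread = threading.Thread(target=self._run)
        self._thread.start()

    def send(self, message):
        """Enqueue a message; callers never touch the actor's state directly."""
        self._mailbox.put(message)

    def _run(self):
        # Messages are handled one at a time, in arrival order.
        while True:
            kind, payload = self._mailbox.get()
            if kind == "add":
                self._total += payload
            elif kind == "get":
                payload.put(self._total)   # payload is a reply queue
            elif kind == "stop":
                return

    def join(self):
        self._thread.join()


counter = Actor()
for n in range(1, 101):
    counter.send(("add", n))

reply = queue.Queue()
counter.send(("get", reply))
total = reply.get()          # blocks until the actor answers
counter.send(("stop", None))
counter.join()
print(total)
```

Because a single sender enqueues everything in order, the `get` is handled only after all 100 `add` messages, so no lock around `_total` is needed.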
In other words, it would require Greenspunning Erlang.
Rob Pike and the rest of the (ex)Bell Labs/Unix gang have been on both sides of this for over 20 years: first with Squeak/Alef/Limbo, then switching to using C with libthread (http://man.cat-v.org/plan_9/2/thread not to be confused with pthreads, for details see: http://swtch.com/~rsc/thread/ ), but in the end having language support for concurrency is something that is really worth it, and one of the reasons they ended up building Go.
You can build a library for doing Erlang-style programming, but there are lots of features that you can't reproduce.

(1) In C++ you may get hard crashes, so to get reliability you need to use high-overhead OS processes. Erlang allows you to run hundreds of thousands (yes, really) of processes because they are really lightweight (76 words + heap/stack). Also, copying within the same address space has much lower overhead (no system call/context switch) and serialisation is much simpler.

(2) Because of immutable data you can have nice fault tolerance, as follows. Imagine you have a request handler function that takes a state and a request. However, the request is malformed or triggers a bug in the handler. In a language that doesn't enforce immutability you have to restart the handler process because the state may be in an inconsistent form. In Erlang you would just discard the request (or send an error message) and handle the next request with the unmodified state.

(3) Pattern matching makes receiving a message very convenient.

I probably forgot a few things, but this gives you an idea.
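Point (2) can be sketched roughly like this, in Python for brevity. The `handle` and `serve` functions and the account/balance state are hypothetical. Python does not enforce immutability, so this only works as long as the handler follows the convention of returning a new state rather than mutating the old one; in Erlang the language gives you that for free.

```python
from types import MappingProxyType


def handle(state, request):
    """Return a NEW state; never mutate the one passed in.

    `state` is a read-only mapping of account name -> balance.
    """
    account, amount = request               # raises on a malformed request
    new_balance = state[account] + amount   # raises KeyError on unknown account
    return MappingProxyType({**state, account: new_balance})


def serve(state, requests):
    """Process requests in order, discarding any that make the handler fail."""
    errors = []
    for request in requests:
        try:
            state = handle(state, request)
        except Exception as exc:
            # The old state was never modified, so just keep using it.
            errors.append((request, type(exc).__name__))
    return state, errors


initial = MappingProxyType({"alice": 10})
requests = [("alice", 5), "garbage", ("bob", 1), ("alice", -3)]
final_state, errors = serve(initial, requests)
print(dict(final_state), errors)
```

The two bad requests are dropped and the good ones on either side of them still apply, with no restart of the handler loop.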

The feature those languages are missing in order to do actors correctly is enforced isolation of state. While you can emulate actors and message passing in nearly any language, you need this guarantee so that any actor can be executed anywhere at any time, without worrying that it is going to cause a race condition or hit a mutex somewhere. The big sin in Java and C++, which makes them quite unsuitable, is "static".

It might be possible to achieve that isolation if you stick to strict coding practices, but the point where it's going to fail is when you import another library, written by someone else.

Because C++, Java and many other languages (including F#, Scala and others which have message passing libraries) do not have the enforced isolation of state, nor the ability to add notation to indicate that some piece of code is free of side effects (including all of its dependencies), you're always going to have the risk of breakage when importing a library. The only way you can be sure that a library is actor-safe is to read its source code - and if the source code is not available, then you're out of luck.

Java and C++ could have the potential to do actors correctly by adding the notation for side-effect-free code, using custom annotations/attributes on all functions which are actor-safe, and using some compiler extension or static analysis tools to prove correctness. I'm not aware of anything that does this for the mentioned languages though.

On the other hand, if you implement message passing in any purely functional programming language, your actors are automatically safe for free.

I guess it is easier to deal with them, and using them is more idiomatic, in Erlang. For example, you can write your own persistent data structures in Java and use them in your concurrent program. Actually you don't have to write your own: Clojure's persistent data structures are written in Java and you can use them in Java, but these data structures make much more sense with the rest of Clojure's tools, syntax, semantics...
See Akka: http://akka.io/
Keep in mind that in a language like Erlang this is enforced for all its users. So the whole ecosystem is built around that premise.
I heard Jonas Bonér (Akka guy) talk about his motivations for building Akka. He said he started the project because he fell in love with Erlang but couldn't convince enough companies to deploy it. So Akka is a port of the essential ideas to Scala and Java.
That's what I've done on maybe my last 5 projects. In C++, Java and Python.

Not sure why this is so tough for people to understand. You usually don't need a whole new language to use a paradigm.

You don't need anything but assembler. Yet you just named three languages and not a one was assembler. Once we've admitted that the niceties offered by more specialized languages are worthwhile, that "you don't need a new language" argument becomes a whole lot less compelling. For all you know, maybe programming this stuff in C++ is like programming sequential code in assembler — dismissing it just because the tool you have can be made to do the same thing isn't bad (I'd never disparage people who accomplish things), but it doesn't make a very compelling case for anything besides your own comfort zone.
What you've just laid out is what I guess people call a "straw man". No one is arguing assembler over C here, just like no one is arguing the virtues of vacuum tubes over the transistor.

I'll say it again: the benefits of the concurrency model described in the original article can probably be achieved in people's existing/preferred dev env. It's easier and faster to look into that than to throw what you have under the bus for erlang.

If you want to get into logical parlance, it isn't a straw man, it's reductio ad absurdum. That is, you take somebody's reasoning and apply it to a different situation that illustrates the problems in the reasoning more clearly.

In this case, your thinking appears to be, "I can accomplish the same thing in C++, so why use Erlang?" My point is that an assembly programmer might just as well say, "I can accomplish the same thing in assembly, so why use C++?" The answer is that it's a lot easier and more natural to write OO code in C++ than it is to use, say, assembler macros. The fact that something is possible is less interesting than how simple and well-supported it is.

Again, I'm not saying your choice is necessarily wrong. For your use case, it might have been right. The benefits of using a language with pervasive and highly developed support for the paradigm might not outweigh the benefits of using C++ for you. But that reflects more on your personal circumstances than on the benefits of Erlang in general.

>In this case, your thinking appears to be, "I can accomplish the same thing in C++, so why use Erlang?" My point is that an assembly programmer might just as well say, "I can accomplish the same thing in assembly, so why use C++?" The answer is that it's a lot easier and more natural to write OO code in C++ than it is to use, say, assembler macros. The fact that something is possible is less interesting than how simple and well-supported it is.

What you seem to miss is that C++ also has some inherent value that cannot be replicated in Erlang. Namely: C++ is not assembly. You are already working on a higher level language. So the question is not "assembly or some higher level" but "how high a level should I go"?

Your response is essentially: "you should go to the highest level you can for concurrency, which (in your opinion) is Erlang".

The problem with that: you essentially reduce the whole problem domain to handling concurrency. Not what I call a good engineering analysis.

How about reusing HUGE EXISTING C++ libraries for his problem domain, instead of replicating them in Erlang?

How about reusing a ready team of C++ experts in his company, instead of retraining them in Erlang?

How about reusing existing tooling and infrastructure his company has for C++, instead of throwing it and using Erlang?

How about interfacing with external systems for which he has C++ drivers, but no Erlang ones?

You say: "The fact that something is possible is less interesting than how simple and well-supported it is". Maybe. But "well supported" is also not just a language attribute. How well supported is it within the industry, within his company, with his toolset, with the code he has, etc.?

You basically just restated the last paragraph of my comment as though it were something I haven't thought of.

I'm not trying to tell everyone to use Erlang. Getting things done is more important than the tool you use to do it, so whatever does that for you is good. But I'd be really amazed if somebody had used both Erlang and a C++ actor library and come out thinking that C++ was about as good as Erlang at its own game.

> the benefits of the concurrency model described in the original article can probably be achieved in people's existing/preferred dev env.

No, they can't. This has been well explained elsewhere. You'd have to rework the language/dev environment down to at least the C level if not assembly.

>It's easier and faster to look into that than to throw what you have under the bus for erlang.

No, it is slower and harder.

People think Erlang is difficult simply because the syntax is weird. Spend a couple weeks learning it and you'll be up to speed.

I think people are scared off by the syntax and so are trying to rationalize that they don't really need erlang.

It is possible to replicate erlang elsewhere, but you'd have to replicate erlang. You can't just add a library to ruby and get it.

The problem with the syntax isn't that it's weird. It's just ugly. Even Perl and C++ look nicer. It's the 1998-geocities-site-full-of-animated-gifs of programming languages.
I thought this way about Erlang code at first too, but after a short time I came around and now I think it can be extremely elegant and even beautiful.
> It's just ugly. Even Perl and C++ look nicer.

Totally disagree. Erlang is odd, but you have to understand what you're getting with that. Variable unification, etc., is very powerful (kind of like Haskell's pattern matching but a bit more powerful).

Why not just whip up a simple method dispatch mechanism in C and skip Java/C++ entirely?

edit: </internet sarcasm> ducks

Sure, if you like. I like (for example) the RAII and templating features of C++, but if you want to just use C go for it. I guess what I really mean is, what does Erlang offer beyond "concurrency is easy if you don't share state"?
I believe the point that oconnore was trying to make is that asking what Erlang offers besides message passing concurrency is like asking what C++ offers beyond RAII and templates. You could add that sort of functionality to C, but it's not going to be as nice as just using C++. A lot of the things which make Erlang ugly are safeguards that prevent the kinds of bugs that allow you to accidentally share state. For instance, if I send a pointer between distributed systems, my program is hosed. Any distributed C++ library is going to either be extremely limited in the kinds of objects it can send or, more likely, just declare by fiat that it's the programmer's responsibility not to send anything with a pointer. That's okay until the moment that you import a third party library and you have to go through every object to make sure that it doesn't use a pointer somewhere as a private variable. Meanwhile, with Erlang, there aren't any pointers, so I never even have to think about this stuff.
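A loose analogy in Python, for brevity. The `send` function below is a hypothetical stand-in for a message-passing library that always delivers a serialized copy instead of a reference, and a `threading.Lock` buried inside a message plays the role of the private pointer. Python at least fails loudly at runtime here; with raw pointers in C++ nothing would stop you.

```python
import pickle
import threading


def send(mailbox, message):
    """Hypothetical send(): deliver a serialized copy, never a reference."""
    mailbox.append(pickle.loads(pickle.dumps(message)))


mailbox = []
order = {"id": 7, "items": ["bolt"]}
send(mailbox, order)
order["items"].append("nut")   # the sender mutates its own copy afterwards
received = mailbox[0]          # the receiver's copy is unaffected

# A message that looks like plain data but hides something that cannot
# leave this process (standing in for a pointer in a private member).
sneaky = {"name": "widget", "internals": {"lock": threading.Lock()}}
try:
    send(mailbox, sneaky)
    failure = None
except TypeError as exc:
    failure = type(exc).__name__

print(received, failure)
```

The only way to learn that `sneaky` is unsendable is to try it (or to read through everything it contains), which is exactly the audit of third-party objects described above.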
Yes, which one can I use?
You could re-implement erlang in C++, but then you'd end up with erlang again. Why not just use erlang in the first place?

I don't believe it is possible to do real concurrency on the JVM, without rewriting key parts of the JVM.

> You could re-implement erlang in C++, but then you'd end up with erlang again. Why not just use erlang in the first place?

Because rewriting or shimming the 10MM lines of code in the libraries one depends on outweighs this one particular benefit of erlang.

> I don't believe it is possible to do real concurrency on the JVM, without rewriting key parts of the JVM

Define "real concurrency".

This one particular benefit is not available in any other language, unless you start with a relatively low level language like C and recreate it.

Multi-threading is doable in any number of languages, but the problems inherent with it are why people move to concurrency.

Concurrency needs to be supported in the language itself.

You'd spend 20 years recreating this in C++ vs 2 weeks learning erlang.