by devishard 3438 days ago
>> Your Erlang program should just run N times faster on an N core processor

> But only if your program is embarrassingly parallel with at least N times available parallelism in the first place! If you have one of those it's already trivial to write a version that runs N times faster on N cores in C, Java, multi-process Python, whatever.

You've made two claims, one irrelevant and one false:

1. "only if your program is embarrassingly parallel" is irrelevant, because almost every program is embarrassingly parallel in Erlang. The language is built around concurrency to the point that parts which wouldn't be obviously parallel in another language are in Erlang. Further, slow hashes in crypto have taught us that it is actually quite difficult to make something which can't be parallelized.

2. "it's trivial to write a program that's faster in X language": I'm not sure how to define trivial, but I've yet to find a language that can communicate between threads as performant-ly. Even languages like Clojure which use similar thread semantics can't do what Erlang can because the underlying threads aren't as lightweight. Spinning up a million threads in Erlang isn't even unusual, whereas in any of the languages you mention it's either cripplingly slow (Python or Java) or very difficult to synchronize (C most, but Python and Java aren't easy).


I've shipped Erlang code and spoken at Erlang events and even I don't believe that every program in Erlang is embarrassingly parallel. There are very real limits on what you can do.

Erlang has never been a language about amazing performance. It's been about an environment and library that delivers amazing distributed concurrency with shockingly little effort.

This is an amazing accomplishment and should not be undersold. In making unsupported claims about how Erlang has magical parallelism sauce that somehow ignores Amdahl's law, we ignore the real benefits in favor of a fiction.

>only if your program is embarrassingly parallel is irrelevant, because almost every program is embarrassingly parallel in Erlang. The language is built around concurrency to the point that parts which wouldn't be obviously parallel in another language are in Erlang. Further, slow hashes in crypto have taught us that it is actually quite difficult to make something which can't be parallelized.

OK. Not an Erlang user, but I can't let this statement go.

I've studied parallel numerical algorithms. Many/most of them will involve blocking because you're waiting for results from other nodes.

If you're saying Erlang has somehow found a way to do those numerical algorithms without having to wait, then I'd love to see all those textbooks rewritten.

Amdahl's Law reigns supreme.
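
To make the waiting concrete, here is a minimal Python sketch of a 1-D Jacobi relaxation split across worker threads. It is an illustration only, not something taken from the thread or from a particular textbook, and the names `relax_chunk`, `jacobi_serial` and `jacobi_parallel` are hypothetical. Each sweep needs neighbouring values from the previous sweep, so every worker has to finish sweep k before any worker can start sweep k+1.

```python
from concurrent.futures import ThreadPoolExecutor


def relax_chunk(old, start, stop):
    """Return new values for old[start:stop]; each point averages its two neighbours."""
    return [(old[i - 1] + old[i + 1]) / 2.0 for i in range(start, stop)]


def jacobi_serial(values, sweeps):
    """Reference version: one worker, nothing to wait for."""
    cur = list(values)
    for _ in range(sweeps):
        cur = [cur[0]] + relax_chunk(cur, 1, len(cur) - 1) + [cur[-1]]
    return cur


def jacobi_parallel(values, sweeps, workers=4):
    """Same computation, with the interior points split across workers."""
    cur = list(values)
    n = len(cur)
    # Chunk boundaries over the interior points 1 .. n-2 (end points stay fixed).
    bounds = [1 + (n - 2) * w // workers for w in range(workers + 1)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for _ in range(sweeps):
            futures = [
                pool.submit(relax_chunk, cur, bounds[w], bounds[w + 1])
                for w in range(workers)
            ]
            # Synchronisation point: the next sweep cannot start until
            # every chunk of this sweep has been computed.
            parts = [f.result() for f in futures]
            cur = [cur[0]] + [x for part in parts for x in part] + [cur[-1]]
    return cur
```

In CPython the threads here will not give a CPU speedup anyway because of the global interpreter lock; the point is the dependency structure. The wait on `f.result()` is required by the algorithm, so it would be there in any language or runtime.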

I'm not devishard, but I parsed his statement slightly differently. He's not saying that Erlang makes things parallel magically. Rather, he's saying that Erlang forces tasks that /could/ be parallel to be parallel by default. Thus, Erlang will tend to maximize the sections of your program that are run in parallel compared to other languages.

Which would make the original commenter's point valid:

>Your Erlang program should just run N times faster on an N core processor

No, it won't. It will only be true for tasks that /could/ be (completely/embarrassingly) parallel (as you say). Which is kind of circular.

Not really. Think about every object you'd have in Java that's being passed around your system. Now imagine each of those objects are their own processes and you're passing around references to them.

Just on that one case, you've taken huge chunks of a linear execution pattern and parallelized it. Now make that your norm and amplify it to everything. Now realize that the message passing allows this mode of operation to spread each part of this workload over not only more cores but more machines across the network.

And then realize that you can deploy updates to this code's individual parts while other parts continue running without taking down the whole system.

Then you get Erlang.
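
The "every object is its own process" idea above can be sketched roughly as follows. This is a hedged Python illustration, not Erlang: `CounterActor` and `ask_count` are hypothetical names, and a Python thread is an OS thread, far heavier than an Erlang process. It only shows the shape: private state, a mailbox, and interaction purely by message.

```python
import queue
import threading


class CounterActor:
    """A tiny 'object as a process': state is private, access is by message only."""

    def __init__(self):
        self._mailbox = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def send(self, message, reply_to=None):
        """Asynchronous send: enqueue the message and return immediately."""
        self._mailbox.put((message, reply_to))

    def _run(self):
        count = 0  # state lives only inside this loop
        while True:
            message, reply_to = self._mailbox.get()
            if message == "increment":
                count += 1
            elif message == "get":
                reply_to.put(count)
            elif message == "stop":
                break

    def join(self):
        self._thread.join()


def ask_count(actor):
    """Send 'get' and block until the actor replies (or 5 seconds pass)."""
    reply = queue.Queue()
    actor.send("get", reply)
    return reply.get(timeout=5)
```

Callers hold a reference to the actor and never touch `count` directly, so many such actors can make progress independently. Note that `ask_count` still blocks on the reply: message passing restructures the program, but a caller that needs an answer has to wait for it.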

None of what you said will prevent the need for waiting for the majority of numerical algorithms.

No one's disputing Erlang's prowess at parallelism. What the critic in this thread was saying was that you can only get Nx speedup on an N core processor for a limited set of algorithms. Most parallel algorithms will not fall in this category. Amdahl's law is a general truth - it doesn't matter what your architecture/language is. There is nothing special about Erlang that will make any parallel algorithm scale linearly with nodes.
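
For a feel of the numbers, here is a small Python sketch of Amdahl's law for a fixed-size workload: speedup = 1 / ((1 - p) + p / N), where p is the fraction of the run time that can be parallelized and N is the number of cores. The function name `amdahl_speedup` and the 95% figure are illustrative assumptions, not measurements of any Erlang program.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Speedup predicted by Amdahl's law for a fixed-size workload."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Hypothetical program that is 95% parallelizable.
    for cores in (1, 2, 4, 8, 16, 64, 1024):
        print(cores, round(amdahl_speedup(0.95, cores), 2))
```

With p = 0.95, 16 cores give roughly a 9.1x speedup, and no number of cores can reach 20x, since the limit is 1 / (1 - p). Only p = 1, the fully parallel case, gives exactly Nx on N cores.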

On top of that, Erlang has a scheduler which reduces the need to rely on the operating system's concurrency model.

I'm interpreting "embarrassingly parallel" to mean that it's obvious the task can be parallelized, and I'm saying that many tasks where this isn't obvious in a more serial language are obvious in Erlang.

No, I'm not claiming Erlang breaks Amdahl's Law. I'm claiming that Amdahl's Law applies less often than people think it does.

The preemptive scheduler caps the execution window for anything that would block. If one piece of the system would take a long time to finish, it doesn't interfere with the other parts of the system completing their work on schedule. That one blocking piece will finish more slowly, but every other moving part in the system will keep responding as expected.

>If one piece of the system would take a long time to finish, it doesn't interfere with the other parts of the system completing their work on schedule. That one blocking piece will finish more slowly, but every other moving part in the system will keep responding as expected.

That's how it works in pretty much any language that supports message passing. I used to do MPI programming in C. Everything you said is also true for MPI in C. Later I did some MPI in Python. True there as well. If you have MPI in any language, it is true for that language.

My understanding is that those languages rely on cooperative scheduling within a thread, meaning that the running code has to relinquish control to the scheduler. Threads themselves are preemptively scheduled at the OS layer, but OS threads are much heavier and limited in how many can be running. A Java thread reserves about 1024 KB of stack by default, for example, compared to an Erlang process that starts at a few KB.

>My understanding is that those languages rely on cooperative scheduling within a thread

MPI doesn't require threads.

I think people here are confusing parallel programming with multithreaded programming. One is a subset of the other.

I'm not saying Amdahl's Law is wrong, I'm saying that it doesn't apply to as many problems as people think it does. I said "almost" for a reason.

> almost every program is embarrassingly parallel in Erlang

I can't agree with that, and it's key to my point. There are some problems which we just don't know how to make embarrassingly parallel. Take something classic like mesh triangulation or mesh refinement. Nobody knows how to make those embarrassingly parallel. If you write it in Erlang, it's still not going to be embarrassingly parallel. And it won't scale linearly to N times faster on N cores no matter which language you write it in.

So it's just not true to say that any Erlang program should scale linearly. If nobody on earth knows how to make mesh refinement scale linearly, how will Erlang do it?

Maybe you mean you wouldn't choose to write those programs in Erlang? Well then I think it's a meaningless claim to say Erlang will linearly scale your program, but only if it is a program which is naturally linearly scalable anyway. Erlang hasn't helped you do anything there so why make a claim about it?

"almost every program" != "any program"

WTF. How do you turn a problem which cannot be reduced into parallel sub-components into an Erlang program then? Is there just a huge class of computations that cannot be done with Erlang? Of course not.

Of course you can implement an inherently serial algorithm in Erlang. It just won't run N times faster on N cores.

> I've yet to find a language that can communicate between threads as performant-ly.

Minor nitpick: Erlang uses processes, not threads. Your comment doesn't explicitly say that Erlang uses threads though, so perhaps I'm being overly pedantic.

But an Erlang process is not the same as an OS process, or even an OS thread. It's managed by the Erlang VM and is very lightweight.
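
As a loose analogy for "managed by the runtime rather than the OS", here is a Python sketch that starts ten thousand asyncio tasks on a single OS thread. The names `worker`, `run_many` and `spawn_many` are hypothetical. The analogy is limited: asyncio tasks are cooperatively scheduled and run on one core, whereas Erlang processes are preemptively scheduled by the VM and spread across cores.

```python
import asyncio


async def worker(task_id, results):
    # Yield to the event loop once, as a stand-in for waiting on a message.
    await asyncio.sleep(0)
    results[task_id] = task_id * 2


async def run_many(n):
    results = {}
    # Each coroutine becomes a task owned by the event loop, not an OS thread.
    await asyncio.gather(*(worker(i, results) for i in range(n)))
    return results


def spawn_many(n=10_000):
    """Start n runtime-managed tasks on a single OS thread and wait for all of them."""
    return asyncio.run(run_many(n))
```

Creating the same number of OS threads or OS processes would be far more expensive, which is the distinction the comment above is drawing.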