by cdumler 3972 days ago
No, Erlang does not mean that anything you write will scale. Your solution has to be broken into parallelizable pieces. But the scalability of your solution is only as good as the mechanisms the language and runtime give developers to build a scalable system efficiently. BEAM implements several nice features:

Data is immutable, so we don't have to worry about keeping data coherent between... anything, whether it is two processes or two nodes. New data can be constructed with reference to old data without fear that the old data will be modified. So "mutation" is really just new data with a reference to the old, unchanged data. This greatly lowers the churn in creating new data. It also means everything can just pass along what it has (process to process or node to node) without fear that it will be out of date.
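A minimal sketch of that "new data with a reference to the old data" idea, in Python rather than Erlang, assuming a simple linked list built from tuples. The helper names `cons` and `to_pylist` are made up for illustration and are not BEAM APIs:

```python
def cons(head, tail):
    """Build a new list cell that points at an existing list; nothing is copied."""
    return (head, tail)  # tuples are immutable, so `tail` can never change under us


def to_pylist(cell):
    """Walk the cells and collect the values into an ordinary Python list."""
    out = []
    while cell is not None:
        head, cell = cell
        out.append(head)
    return out


old = cons(2, cons(3, None))  # the list 2, 3
new = cons(1, old)            # the list 1, 2, 3, reusing `old` as its tail

print(to_pylist(old))   # the old list is unchanged
print(to_pylist(new))
print(new[1] is old)    # the tail of `new` is the very same object as `old`
```

The "mutated" list costs one new cell, and anyone still holding `old` sees exactly what they saw before.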

Everything is defined in modules. Modules define what we would think of in OOP as namespaces, structures, classes/types, and class functions. Importantly, they only define functionality. Modules do not have state. Therefore, functions accept some set of inputs, create new data from the inputs (no mutation), and return some output. This makes it very easy to reason about what the code is doing if you keep the modules well defined and reasonably sized. This code can be shared around easily, too. It's got no state and is immutable.
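To illustrate the "inputs in, new data out" style, here is a rough Python approximation using a frozen dataclass. The `Account` type and `deposit` function are hypothetical examples, not anything from Erlang's standard library:

```python
from dataclasses import dataclass, replace, FrozenInstanceError


@dataclass(frozen=True)
class Account:
    owner: str
    balance: int


def deposit(account: Account, amount: int) -> Account:
    """Return a new Account; the one passed in is left untouched."""
    return replace(account, balance=account.balance + amount)


before = Account("alice", 100)
after = deposit(before, 50)

print(before.balance)  # still the original value
print(after.balance)   # the new value lives in a new record

try:
    before.balance = 0  # frozen dataclasses reject in-place mutation
except FrozenInstanceError:
    print("cannot mutate")
```

Because `deposit` keeps no state of its own, calling it twice with the same arguments always gives the same answer, which is what makes this kind of code easy to reason about.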

Processes are an abstraction. You can think of them as threads, but they're really just a stack and a little bookkeeping. A BEAM VM will normally start a number of real threads equal to the number of CPUs in the machine. Each real thread will then exclusively pick a process, load the bookkeeping, point itself to the stack, and execute bytecode for a period of time. When done, it will mark the changes in the bookkeeping and move to the next process. This is very lightweight, so literally millions can run on a single computer. Because they are self-contained, they're easy to clean up. Processes also expose a standard set of interfaces for communication, a pub/sub system. Again, immutable messages are sent back and forth. So it doesn't matter if it's the same node or not.
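A toy version of the "stack plus a little bookkeeping" idea, sketched in Python with generators standing in for processes. Everything here (`Process`, `Scheduler`, `adder`, `producer`) is a made-up illustration: it uses one scheduler loop instead of one per CPU, and processes give up their turn voluntarily with `yield`, whereas BEAM preempts them itself:

```python
from collections import deque


class Process:
    """A generator (the 'stack') plus a mailbox (the 'bookkeeping')."""

    def __init__(self, pid, sched, body):
        self.pid = pid
        self.mailbox = deque()
        self.gen = body(sched, self)


class Scheduler:
    def __init__(self):
        self.processes = {}
        self.run_queue = deque()
        self.next_pid = 0

    def spawn(self, body):
        pid = self.next_pid
        self.next_pid += 1
        self.processes[pid] = Process(pid, self, body)
        self.run_queue.append(pid)
        return pid

    def send(self, pid, message):
        proc = self.processes.get(pid)
        if proc is not None:  # messages to dead processes are dropped
            proc.mailbox.append(message)

    def run(self):
        while self.run_queue:
            pid = self.run_queue.popleft()
            proc = self.processes[pid]
            try:
                next(proc.gen)              # run one slice, until the process yields
                self.run_queue.append(pid)  # then put it back in line
            except StopIteration:
                del self.processes[pid]     # finished: drop its state


results = []


def adder(sched, me):
    total = 0
    while True:
        while not me.mailbox:
            yield  # nothing to do, give up the slice
        msg = me.mailbox.popleft()
        if msg == "stop":
            results.append(total)
            return
        total += msg


def producer(target):
    def body(sched, me):
        for n in (1, 2, 3):
            sched.send(target, n)
            yield
        sched.send(target, "stop")
    return body


sched = Scheduler()
adder_pid = sched.spawn(adder)
sched.spawn(producer(adder_pid))
sched.run()
print(results)
```

Each process costs little more than a generator object and a queue, and cleaning one up is a single dictionary delete. That is roughly why so many of them fit on one machine.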

Finally, everything is abstracted to the notion of nodes within a cluster. By default, anything you execute runs on the local node, but you can specify otherwise. I can execute a module call on another machine or spawn a new process on another machine. It just means a little more information in the call, but it's the same exact concept programmatically. Also, it's possible to group processes into named services. You can call a named service and it will know what processes to contact. It's a very low barrier to entry to parallelize your code if you just write it that way.

When you start thinking in terms of how to structure your code for BEAM, you inherently get easy access to scalability.

1 comments

> Data is immutable...

But you don't need that at the VM level. Clojure does that on the JVM. Having that at the VM level makes a simple GC work reasonably well, but HotSpot has world-class GCs that perform better, even without the assumption of immutability.

> Everything is defined in modules.

Again, that's a language-level feature.

> Processes are an abstraction

You can get that on the JVM, too.

> Finally, everything is abstracted to the notion of nodes within a cluster.

That's the runtime library's concern. Not the VM's.

> When you start thinking in terms of how to structure your code for BEAM, you inherently get easy access to scalability.

All of that is great, but implementing those features at the language/library level and harnessing HotSpot's power would give you that same easy access to even greater scalability.

Hey man, some people (like me) just don't like to work on the JVM, despite its advantages and superior features like the GC. Just accept it.

Having worked with Java, Scala, JRuby and Clojure, something is always clumsy, be it interfacing with Java cruft, slow startup times of the VM, Maven & co... While there are solutions to fix the solutions, it's just annoying for me. I get what you say, but nevertheless, JVM (and .NET) based things are nothing I'd use (unless I'm forced to).

> Hey man, some people (like me) just don't like to work on the JVM, despite its advantages and superior features like the GC. Just accept it.

I accept it, but the fact of the matter is that -- like it or not -- there are at least two orders of magnitude more people who use the JVM than BEAM. You're comparing the world's most popular runtime with a runtime that's not even in the top-ten.

Some of your complaints stem from exactly that difference -- the JVM is designed to handle much higher workloads than BEAM, and people use that -- hence Clojure's slow startup, etc. (the JVM itself starts up in < 80ms, BTW). But, again, your observations don't change the fact that if Erlang stays on BEAM it will forever be a niche language.

Becoming ultra popular isn't an advantage at all. Look how much crap gets produced with JS or PHP: tons of crappy libraries overshadow the few good ones, people write throwaway code like there is no tomorrow, and real (tm) developers have to maintain this mess. And yes, I have seen code in my day job written in Java or Scala that was nearly impossible to even understand. A lot of code.

Again, it's obvious that the JVM is dramatically more in use than BEAM, but I like it so far. Up to now people only stumbled upon Erlang when they actually needed it; now with Elixir & co a few others discover that BEAM might be exactly what they need. This shows in the ecosystem, and the community of #elixir is by far the nicest I have met so far.

Just personal experience, yours might differ (obviously). Just to reiterate: becoming too popular nearly every time results in garbage for everyone. Just look how much stuff gets crammed into JS nowadays.