by RobertKerans
2800 days ago
Out of the box, it's only really designed to scale to a certain point. It's all generally nice, predictable, and low-latency up until that point. But because the nodes are fully meshed, the TCP heartbeats alone kill performance once you go past it. For example, 100 nodes gives you 4950 TCP connections (99+98+97+...+1, i.e. n(n-1)/2). As parent says, there are methods to deal with this (Riak Core would be an example), but they're non-trivial. You also need to bear in mind the design goals: Erlang is designed to run as a highly reliable, self-contained system in _a single geographic location_, with that system possibly left to run on its own for long periods of time (years).
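To make the full-mesh arithmetic concrete, here is a minimal sketch in Python (not Erlang, purely for illustration). It assumes every pair of nodes shares exactly one connection, which gives n(n-1)/2 links; the function name `mesh_connections` is made up for this example.

```python
def mesh_connections(nodes: int) -> int:
    """Count pairwise links in a full mesh.

    Each of the n nodes connects to the other n - 1 nodes, and every
    link is shared by two nodes, hence n * (n - 1) / 2.
    """
    if nodes < 0:
        raise ValueError("node count must be non-negative")
    return nodes * (nodes - 1) // 2


if __name__ == "__main__":
    # The link count grows quadratically with the cluster size.
    for n in (10, 100, 1000):
        print(f"{n} nodes -> {mesh_connections(n)} connections")
```

Going from 100 to 1000 nodes multiplies the connection count by roughly 100, which is why heartbeat traffic stops being negligible.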
|
I've never thought about the dist heartbeats as a scaling problem. If you have thousands of dist nodes and your nodes have small memory, the dist buffers for each connection do add up. I think the default is 8 MB; you can tune it, but it's a scaling concern, especially if you have nodes far apart from each other.
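A rough back-of-the-envelope sketch of how the per-connection buffers add up, again in Python for illustration only. It assumes one buffer per peer connection and takes the 8 MB figure from the comment as-is (it is a recollection, not a verified default; check the `+zdbbl` emulator setting for your OTP version). The result is a worst-case ceiling with every buffer full, not what a healthy node actually uses. The function name is hypothetical.

```python
MIB = 1024 * 1024


def worst_case_dist_buffer_bytes(cluster_size: int, buffer_limit_bytes: int) -> int:
    """Upper bound on dist buffer memory for ONE node in a full mesh.

    Assumes one buffer per peer connection, all filled to the limit.
    """
    peers = max(cluster_size - 1, 0)  # a node does not connect to itself
    return peers * buffer_limit_bytes


if __name__ == "__main__":
    assumed_limit = 8 * MIB  # assumption: figure quoted in the comment
    for n in (100, 1000, 5000):
        total = worst_case_dist_buffer_bytes(n, assumed_limit)
        print(f"{n} nodes: up to {total / (1024 * MIB):.1f} GiB per node")
```

Buffers are more likely to fill when peers are slow or far away, so the ceiling matters most for the geographically spread clusters mentioned here.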
Really, the root design of Erlang was for two nodes colocated in a single chassis. That said, it turns out the design scales pretty well to much larger numbers of nodes, and to nodes farther apart, but you have to be careful with some things. pg2:join and pg2:leave operate under a global lock, which will be slow if there is contention on the lock, or if one of your nodes has some problem where it's still up but very slow. Mnesia doesn't do well with queuing without a lot of help, and schema operations under queuing are definitely a bad idea as well.
If you want to run Erlang at larger scales, you will need to be ready to poke around in OTP, and occasionally in BEAM as well. If you're running big systems, IMHO it makes the most sense for your Erlang nodes to fill your physical nodes, so I don't see much need for containers. But if you do use containers, you need to figure out how to keep their names consistent for Erlang, or it's going to be confused. (OTP has a concept of a 'diskless' node, which would seem to be a good fit for an ephemeral systems environment, but I must admit I haven't played with that.)