by hinkley 2233 days ago
Can someone steer me to some good benchmarks, discussions of perf characteristics and gotchas of the BEAM? My search-fu is weak and I'm not finding the sort of content I'm after.

I'm trying to learn Elixir, and being a systems thinker, before I (can) get too comfortable I'm going to want to dive into origin stories to build up my holistic map of why things are the way they are, what can be done and what can't be done. Understanding bottlenecks in the BEAM seems like it's going to have to be part of that (the way I studied JVM tech documentation when I did perf and architecture work in Java).

2 comments

I think you're going to have a hard time finding what you're after. Erlang in Anger [1] might be the closest; it will at least show some of the gotchas you run into.

From my experience, the gotchas tend to hit with emergent behavior, which is hard to benchmark, and may be repeatable in production, but is hard to model in a testing framework.

I'm not sure how much impact off-heap messaging has had, but the basic gotcha is that as a process gets bigger, it tends to run slower (because GC over more memory takes longer), and develop a larger message queue, which makes it slower still. You need to have backpressure in your system, or small blips in processing can blow up into huge message queues that can't be processed. Overall queue size and maximum queue size are important health indicators to monitor.
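To make the queue monitoring concrete, here is a minimal sketch in Elixir. `QueueCheck` is a hypothetical helper module, not something from OTP or from Erlang in Anger; the only runtime calls it relies on are `Process.list/0` and `Process.info/2` with the `:message_queue_len` key. It only looks at processes on the local node.

```elixir
defmodule QueueCheck do
  # Hypothetical helper for illustration; not part of OTP.

  # Mailbox length of `pid`, or nil if the process has already exited.
  def queue_len(pid) do
    case Process.info(pid, :message_queue_len) do
      {:message_queue_len, n} -> n
      nil -> nil
    end
  end

  # {total, max} mailbox length across all processes on the local node.
  def summary do
    lens =
      Process.list()
      |> Enum.map(&queue_len/1)
      |> Enum.reject(&is_nil/1)

    # Prepend 0 so Enum.max/1 never sees an empty list.
    {Enum.sum(lens), Enum.max([0 | lens])}
  end
end

# A process that never reads its mailbox just accumulates messages.
stuck = spawn(fn -> Process.sleep(:infinity) end)
for i <- 1..100, do: send(stuck, {:work, i})

IO.inspect(QueueCheck.queue_len(stuck), label: "stuck process queue")
IO.inspect(QueueCheck.summary(), label: "{total, max}")
```

In a real system you would probably sample something like this periodically and alert on it, rather than print it once. Walking every process has a cost of its own on a node with a very large number of processes, so treat the sampling interval as something to tune.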

The other basic gotcha is that Erlang/OTP tends to default to 'unlimited' resource limits and 'infinity' timeouts. You often want to have limits and timeouts, but a general system doesn't know what you want. Sometimes the unlimited settings result in terrible system behavior if you hit larger numbers than anyone else tested, but if you hit this, it's usually easy to fix.
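As one small illustration of the timeout point: a bare `receive` with no `after` clause waits forever, so bounding it is up to you. The sketch below assumes a made-up `{:reply, value}` message shape and a hypothetical `Bounded` module; the `receive ... after` construct itself is standard Elixir.

```elixir
defmodule Bounded do
  # Hypothetical helper for illustration; not part of OTP.

  # Wait for a {:reply, value} message, giving up after `timeout_ms`.
  # Without the `after` clause this would block indefinitely.
  def await_reply(timeout_ms) do
    receive do
      {:reply, value} -> {:ok, value}
    after
      timeout_ms -> {:error, :timeout}
    end
  end
end

parent = self()
spawn(fn -> send(parent, {:reply, 42}) end)

# The first call should pick up the reply; the second has nothing left to
# receive and gives up once its timeout elapses.
IO.inspect(Bounded.await_reply(1_000))
IO.inspect(Bounded.await_reply(50))
```

What the right timeout is, and what to do when it fires, depends entirely on your system, which is presumably why the platform doesn't pick one for you.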

A good thing about OTP is that they've written as much as possible of the environment in Erlang itself, so it's easier to change things when needed than in a system where most of the provided APIs are implemented in C.

[1] https://erlang-in-anger.com/

In general I would say there's no good single book or resource that describes everything comprehensively. There are a lot of resources, though, mostly scattered in various places.

The BEAM Book [1] is a good, though unfinished, resource that covers the implementation in general: the memory model and the interpreter.

If you're interested in some very low-level details of the runtime, the internal documentation [2] also holds a lot of interesting details.

There are also some additional details on internals at Spawned Shelter [3].

[1]: https://blog.stenmans.org/theBeamBook/
[2]: https://github.com/erlang/otp/tree/master/erts/emulator/inte...
[3]: http://spawnedshelter.com/#erlang-design-choices-and-beam-in...

I want to know the constraints to, and evolution of, sequential computation on the BEAM. I want to form opinions on how that landscape is likely to change within the lifespan of a project I'm affiliated with.

I get mostly false positives trying to find those sorts of discussions or metrics.

I am not sure I understand the problem you are trying to find information about. Maybe explain it a little bit more? Or go ask in the Elixir forum; people there can try to be your librarians.

To an outsider, it seems like the BEAM documentation [and particularly, videos] go out of their way to discuss how process management and IPC communication works and how certain classes of data are managed. They talk about what makes the BEAM the BEAM to the exclusion of all other concerns.

Prior to finding this document (http://www.cs-lab.org/historical_beam_instruction_set.html) I had no idea whether you could actually do computation on the BEAM. I was starting to wonder if they had misappropriated the term VM, and some sort of inline assembly trick was being used for everything but control flow and IPC.

Interpreted code has very, very real computational constraints and you can't assume people will know this, even now. Especially if your system is noteworthy for how it is not like other systems. Where does it stop being 'weird' and start being conventional? The boundaries describe both sides of a distinction. Even if you're only interested in the exotic part, leave some breadcrumbs for others.

Hmm, are you maybe after this page?

http://erlang.org/doc/efficiency_guide/advanced.html

> [and particularly, videos] go out of their way to discuss how process management and IPC communication works

After watching a bunch of videos on BEAM/erlang/elixir I came to the conclusion that it isn't a platform for computation, it's a platform for communication. The best video (by far) was The Soul of Erlang and Elixir • Saša Jurić https://www.youtube.com/watch?v=JvBT4XBdoUE

Two shallow benchmarks:

https://benchmarksgame-team.pages.debian.net/benchmarksgame/...

https://www.techempower.com/benchmarks/ (phoenix is at 175)

I still don't understand what you are talking about, but I suppose this is more due to my own limitations in missing a lot of the background you come from. Sorry that I am not that useful :(