by discordianfish 2118 days ago
Yes, you need to learn the BEAM VM, which makes it not easy to learn. It still might be worth it, but now you need to understand Elixir, Erlang, and BEAM, in addition to your OS's API.

All I can offer are anecdotes, but I've worked at one company with a sizeable RabbitMQ cluster where we were fortunate enough to have a guy on a different team who was a BEAM maintainer in his free time. He was the only one able to debug issues (mostly around unexpected memory usage).

At another place, we had some in-house Elixir services. Developers were very happy and productive with it, but we also ran into unexpected memory usage and crashes because of it, and to this day the root causes are unclear, even after adding detailed telemetry to monitor process counts and memory profiles.

I guess I have a general issue with VM-based languages, but that might be because I spend more time debugging issues than writing software, and I appreciate the common libc/syscall API used in VM-less languages.

1 comment

> unexpected memory usage

The number one cause of hard-to-track memory usage in Elixir, especially with services that parse JSON: someone is ingesting JSON and caching a snippet of the JSON in an ETS table, which means the entire JSON is never GC'd.

This is pretty easy to identify and fix once you know about it and how to look.

To identify:

erlang:memory() shows an unexpectedly large byte count for binary.
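A minimal sketch of that first check, assuming you can load a small helper module on the node. The module and function names (binmem, binary_share) are made up for illustration; erlang:memory/1 itself is a real BIF and reports bytes.

```erlang
-module(binmem).
-export([binary_share/0]).

%% Returns {BinaryBytes, TotalBytes, Percent}.
%% Both figures come from erlang:memory/1 and are in bytes.
binary_share() ->
    Bin = erlang:memory(binary),
    Total = erlang:memory(total),
    {Bin, Total, Bin * 100 div Total}.
```

What counts as "unexpectedly large" depends on the workload; comparing the percentage against a freshly started node is one way to get a baseline.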

Crawl all your ETS tables and check whether binary:referenced_byte_size(Binary) is significantly larger than byte_size(Binary) (you need to check the keys as well as the values, and you'll need to write something to crawl through your complex values, of course).
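Something along these lines could do the crawl. It is a rough sketch, not a polished tool: the module name ets_bin_crawl and its functions are hypothetical, it only descends into tuples, lists and maps, and it counts the same parent binary once per reference, so the totals are an upper bound.

```erlang
-module(ets_bin_crawl).
-export([scan/0, scan/1, waste/1]).

%% Scan every table we can read; return [{Table, WastedBytes}],
%% worst offenders first, tables with no waste left out.
scan() ->
    Found = [{Tab, safe_scan(Tab)} || Tab <- ets:all()],
    lists:reverse(lists:keysort(2, [R || {_, W} = R <- Found, W > 0])).

%% Total wasted bytes over all objects (keys and values) in one table.
scan(Tab) ->
    ets:foldl(fun(Obj, Acc) -> Acc + waste(Obj) end, 0, Tab).

%% Private tables, or tables deleted mid-scan, raise badarg; count them as 0.
safe_scan(Tab) ->
    try scan(Tab)
    catch error:badarg -> 0
    end.

%% Bytes kept alive by a term beyond what its binaries actually use.
waste(B) when is_binary(B) ->
    binary:referenced_byte_size(B) - byte_size(B);
waste(T) when is_tuple(T) -> waste(tuple_to_list(T));
waste([H | T]) -> waste(H) + waste(T);
waste(M) when is_map(M) -> waste(maps:to_list(M));
waste(_) -> 0.
```

Calling ets_bin_crawl:scan() from a remote shell gives a list to start from. On a big node, ets:foldl over every table is not free, so you may want to scan one suspect table at a time with scan/1.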

If you don't find it there, it's probably references held in processes. You can crawl processes with process_info(Pid, binary), but it's a little tricky because the value you get back is undocumented.
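For the process side, here is a hedged sketch. It assumes the binary info is a list of {Id, Size, RefCount} tuples, which is what current releases return but is explicitly subject to change, so anything that does not match is ignored. The module and function names (proc_bin_crawl, bin_bytes, top) are invented for this example.

```erlang
-module(proc_bin_crawl).
-export([top/1, bin_bytes/1]).

%% Sum of the sizes of the ref-counted binaries a process refers to.
%% A binary referenced more than once is counted more than once.
bin_bytes(Pid) ->
    case erlang:process_info(Pid, binary) of
        {binary, Info} when is_list(Info) ->
            %% Assumed tuple shape; non-matching entries are skipped.
            lists:sum([Size || {_Id, Size, _Refc} <- Info, is_integer(Size)]);
        _ ->
            0   % process already exited, or unexpected format
    end.

%% The N processes holding the most binary bytes, largest first.
top(N) ->
    All = [{P, bin_bytes(P)} || P <- erlang:processes()],
    lists:sublist(lists:reverse(lists:keysort(2, All)), N).
```

proc_bin_crawl:top(10) gives a short list of pids to inspect further, for example with process_info(Pid, [registered_name, current_function]).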

If you find a lot of memory used by binaries, the solutions are all pretty simple:

a) if it's binaries held in processes, and those processes really don't hang onto binaries, just receive and send them, and you're running older Erlang (before 19?), try to upgrade; there's a GC change around then to include refC binary size when checking to see if process GC should run.

b) anywhere else, do StorableBinary = binary:copy(Binary) and store StorableBinary instead of Binary. This gets you an exactly sized binary (heap or ref-counted), instead of a binary referenced into another.
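To make fix (b) concrete, a small sketch. The module name bincopy_demo and the cache_snippet/3 helper are hypothetical; binary:copy/1 and binary:referenced_byte_size/1 are the real stdlib calls. The sizes are deliberately above 64 bytes so that both binaries are ref-counted.

```erlang
-module(bincopy_demo).
-export([demo/0, cache_snippet/3]).

%% Store an exactly sized copy, so the cached snippet does not keep
%% the binary it was cut from alive.
cache_snippet(Tab, Key, Snippet) when is_binary(Snippet) ->
    ets:insert(Tab, {Key, binary:copy(Snippet)}).

%% Shows the difference the copy makes for a slice of a larger binary.
demo() ->
    Big = binary:copy(<<"x">>, 1000),          % stand-in for a parsed JSON payload
    <<Snippet:100/binary, _/binary>> = Big,    % sub-binary pointing into Big
    Copy = binary:copy(Snippet),               % fresh binary, exactly 100 bytes
    {binary:referenced_byte_size(Snippet),
     binary:referenced_byte_size(Copy)}.
```

demo() should return {1000, 100}: the slice still references all 1000 bytes, the copy only its own 100. The copy costs an allocation and a memcpy, so it is worth doing at the point where data goes into long-lived storage rather than everywhere.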

Parsing larger binaries (like JSON) is one way to get these overlarge binaries, but they're also pretty easy to get with idiomatic code like that shown in the efficiency guide[1]; when the system builds an appendable refC binary, it makes appending efficient, but storage inefficient.

[1] http://erlang.org/doc/efficiency_guide/binaryhandling.html#c...