|
Here's a trick to confidence in a BEAM system. If you get good at hot loading, you significantly reduce the cost of deployment, and you don't need as much pre-push confidence. You can do things like "I think this works, and if it crashes, I'll revert or fix forward right away" that just aren't a good fit for a more common deployment pattern where you build the software, then build a container, then start new instances, then move traffic, etc. Of course, there are some changes that you need confidence in before you push, but for lots of things, a bit crashy as an intermediate step is acceptable. As for understanding the OTP stuff, I think you have to be willing to look at their code. Most of it fits into the 'as simple as possible' mold, although there's some places where the use case is complex and it shows in the code, or performance needs trumped simplicity. There's also a lot of implicitness for interaction between processes. That takes a bit of getting used to, but I try to just mentally model each process in isolation: what does it do when it receives a message, does that make sense, does it need to change; and not worry about the sender at that time. Typically, when every process is individually correct, the whole system is correct; of course, if that always worked, distributed systems would be very boring and they're not. |
Because it's possible to do hot code reloading, and since you can attach a REPL session into a running BEAM process, running 24/7 production Erlang systems - rather counterintuitively - can encourage somewhat questionable practices. It's too easy to hot-patch a live system during firefighting and then forget to retrofit the fix to the source repo. I _know_ that one of the outages in the previous job was caused by missing retrofit patch, post deployment.
The running joke is that there have been some Ericsson switches that could not be power cycled because their only correct state was the one running the network, after dozens of live hot patches over time had accumulated that had not been correctly committed to the repository.