| > Do people actually really log into running production systems and update code? Yes, I've done it once in a while. One case is deploying a fix while the customer's system is up and running, say for something urgent that can't wait to go through the full deployment pipeline. Because hot code reloading works so well in Erlang, it's not as risky as doing it in Java, for example. In fact, upgrading by hot code reloading is a common thing in the Erlang world, so there are cases where it is done routinely. It takes some preparation and so on: http://learnyousomeerlang.com/relups Another case is when you see an issue happening but don't have enough logging or tracing in that part of the code. You can upgrade the code with an additional log statement, or save extra info to a file for debugging, and then remove the patch. The alternative is to try to replicate the issue on a separate system, which sometimes isn't easy: you don't have the exact access pattern, the exact data, and the other factors that would duplicate the original environment. But you're right, doing it haphazardly and just sprinkling hot-patched code updates everywhere is a path to disaster. So it's possible to monitor and record these updates to make them visible and better managed. It's up to the team / organization to handle that. The bottom line: don't do it routinely, but when you have to, it can really save the day. And it's something that many (most!) frameworks / runtimes / languages don't support as well as Erlang does. |
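For the ad hoc case, loading a patched module from a shell attached to the running node might look roughly like the sketch below. `my_mod` is a hypothetical module name, and this assumes the recompiled `my_mod.beam` is already in the node's code path. A proper release upgrade (the relups link above) involves a lot more than this.

```erlang
%% Run in a shell attached to the running node.
%% my_mod is a hypothetical module; its new .beam is assumed to be in the code path.

%% Remove the previous (old) version, but only if no process is still running it.
%% Returns false, and purges nothing, if some process still uses the old code.
code:soft_purge(my_mod).

%% Load the new version. The version that was current until now becomes "old";
%% processes switch to the new code on their next fully qualified call.
code:load_file(my_mod).
```

Using `code:soft_purge/1` instead of `code:purge/1` is the cautious choice here: `purge` kills any process still executing the old code, while `soft_purge` just refuses.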
What many people may be interested to know about Erlang is that you can log in to a production system, start a new shell running a tracer that listens on a localhost TCP socket[1], and use the dbg module in the production VM to trace calls and messages (and more) between any functions in any processes - in the running production system - and send them to the tracer node.
Done judiciously, the overhead is negligible, and the benefits are great. You can zoom in on bugs in real time.
I find the syntax of dbg match specs ugly, but dbg has saved my bacon so often that it is well worth it. It doesn't get mentioned much, even though I feel it is almost as much of a superpower as hot code loading.
[1] You use the separate shell to avoid accidentally crashing the production VM; if you do something boneheaded in the port-based shell, you can kill it and the production VM will just stop sending trace data to the dead TCP socket.
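A minimal sketch of that setup, for anyone who hasn't tried it. The port number `4711` and the module/function `my_mod:my_fun/1` are placeholders, not anything from a real system; the `dbg` calls themselves are the standard ones from the runtime_tools application.

```erlang
%% --- In a shell on the production node ---
%% Send trace output to a TCP trace port instead of a local tracer process.
%% 4711 is an arbitrary example port.
dbg:tracer(port, dbg:trace_port(ip, 4711)).

%% Enable call tracing for all processes. Nothing is emitted until a
%% trace pattern is set, so this alone is cheap.
dbg:p(all, [c]).

%% Trace a single hypothetical function, only when its argument is > 100,
%% and also report the return value. fun2ms builds the match spec; the raw
%% form would be [{['$1'], [{'>', '$1', 100}], [{return_trace}]}].
dbg:tpl(my_mod, my_fun, 1,
        dbg:fun2ms(fun([X]) when X > 100 -> return_trace() end)).

%% --- In the separate tracer shell on the same host ---
%% Connect to the trace port and print the trace messages as they arrive.
dbg:trace_client(ip, {"localhost", 4711}).

%% --- Back on the production node, when done ---
%% Stop tracing. On older OTP releases dbg:stop_clear() was the variant
%% that also cleared the trace patterns.
dbg:stop().
```

Keeping the pattern narrow (one function, one arity, a guard in the match spec) is what keeps the overhead down; tracing `'_'` across a busy module is the kind of thing that can hurt.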