by brightball 2555 days ago
It's a difference in runtime purpose.

With a single code base where millions of running processes are the norm (some handling direct client requests, others handling in-progress work, data storage, holding open connections for transfers, etc.), you get the capability to deploy without disrupting ANY of that.

Most runtimes can't do anything close to that. Think about all of those X million websocket benchmarks...now think about being able to deploy without forcing all X million to try to reconnect at the same time.

And it can do this while all of the nodes are connected and communicating with each other as well as the outside world.

For standard-issue client/server, it's not that big of a deal. You just separate the web parts behind a load balancer.

For background workers, long lived connections, web sockets, video/audio streams, file transfers (CDN)...it's huge.

For all that to work you will need:

- to understand exactly what your app is doing

- to understand exactly what Erlang/Elixir releases are doing

- to understand exactly how code upgrade works

- to understand exactly or very damn well how to make the system handle those 1 million connections

- to understand exactly how to handle all the things you wrote about

And then, and only then will you be able to "think about being able to deploy without forcing all X million to try to reconnect at the same time."

There are no magic bullets.
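
To make the code upgrade point concrete: a process that survives an upgrade keeps whatever state the old code built, so the new code has to convert it. In OTP that conversion lives in the `code_change` callback. Here is a rough model of the idea in Python rather than Erlang or Elixir. Python has no hot code loading, and every name in it (`ConnectionTracker`, `migrate_v1_to_v2`, the state shapes) is made up for illustration.

```python
# Conceptual model only. All names and state shapes here are hypothetical.

def migrate_v1_to_v2(state):
    """v1 kept a plain list of connection ids; v2 wants a dict with per-connection data."""
    return {
        "version": 2,
        "connections": {conn_id: {"bytes_sent": 0} for conn_id in state},
    }


class ConnectionTracker:
    """Stands in for one long-lived process whose state survives an upgrade."""

    def __init__(self, state):
        self.state = state

    def code_change(self, old_version):
        # Runs once per live process when the new code takes over.
        # The state is converted in place; nothing is restarted.
        if old_version == 1:
            self.state = migrate_v1_to_v2(self.state)
        return self.state


tracker = ConnectionTracker(["conn-a", "conn-b"])  # state built by the v1 code
tracker.code_change(old_version=1)
print(tracker.state)
```

If the migration is missing or wrong, the new code ends up reading state in a shape it does not expect, which is why you need to know what your app keeps in memory before relying on upgrades.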

Eh...it’s basically a one-line command with Distillery. Another for the rollback capability.

It’s pretty magical. There’s a reason people love it.

Certainly, don’t use it if you don’t need it...it introduces extra complexity...but if you do it’s really hard to beat.

Having to support two different versions of the Elixir service during the rollout period is risky...

What if something goes wrong during the rollout? For example, if you change the database schema, upgrade your database engine, or change your back-end authentication approach, it can break the old code. Then how will you know whether the problem is in the old code or the new code when the cluster is running a mix of both?

Those are things you have to account for in any zero downtime deploy situation though. It’s mostly minor changes in how you roll out schema changes.
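
As a minimal sketch of what that kind of schema rollout can look like, assuming a made-up `users` table and a new nullable `email` column (neither comes from any real app), here it is in Python with an in-memory SQLite database. The point is only that the schema change is additive, so old and new code can write to the same table during the rollout.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")


def old_insert(name):
    # Old code only knows about the original column.
    db.execute("INSERT INTO users (name) VALUES (?)", (name,))


def new_insert(name, email):
    # New code also writes the added column.
    db.execute("INSERT INTO users (name, email) VALUES (?, ?)", (name, email))


old_insert("before-migration")

# Additive change: the new column is nullable, so old inserts keep working.
db.execute("ALTER TABLE users ADD COLUMN email TEXT")

# During the rollout both versions write to the same table.
old_insert("old-node")
new_insert("new-node", "new@example.com")

rows = db.execute("SELECT name, email FROM users ORDER BY id").fetchall()
print(rows)
```

Anything stricter, like making the column required or dropping an old one, waits for a later deploy once no old code is still running.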

I was mostly listing things you need for Erlang. Elixir, thankfully, hides a lot of things away in much friendlier packages.

That said, even with Distillery, if something goes awry, you'll need to know the warts behind the magic :)