Hacker News
by emluque 3632 days ago
You seem to be familiar with Elixir. May I ask you some questions?

. What kind of web applications are you building with it? I'm asking what kind of web apps or scenarios do you think Elixir is particularly well suited for?

I saw a thread on Elixir a couple of days ago and it piqued my interest. I watched a couple of the videos that were posted there, one from some Ruby Conf that claimed that Elixir was giving better results (in request time) than rails. He never explained how that was possible, or what the results would have been if he had been using a cache (it's always faster to hit an in-memory cache than a db that has to touch disks, so if he sped things up without a cache he would speed things up even more with one). Then I watched another video from some conference in Oslo or something, and from what I could understand he was doing away with the db completely.

. So I have another question: how do you architect your application in Elixir to keep application state through multiple requests (sessions) on multiple boxes without using something like memcached or redis (or a network disk)?

. Even if it's running on only one box, where does data reside if you are not using a db?

I have a basic understanding of Erlang processes (what's explained here: http://stackoverflow.com/questions/2708033/technically-why-a...) and how it's particularly well suited for concurrency. My questions are about Web Apps and Elixir and scaling.

3 comments

This is a benchmark I did with the versions of Rails and Phoenix that were current in October 2015.

select * from visits, plus conversion to JSON and delivery to a client on local loop. About 5000 records.

* Phoenix 140 ms

* Rails 248 ms

* Ruby without AR 219 ms

* PostgreSQL 2.97 ms, with no JSON generation and no delivery

select started_at, duration from visits -- JSON and delivery

* Phoenix 74 ms

* Rails 116 ms

* Ruby without AR 88 ms

* PostgreSQL 3.47 ms, no JSON no delivery

Single process, so maybe Phoenix could get a larger advantage as the number of processes/requests increases. For the typical no- to low-traffic site there is little difference; the tool the programmer is more familiar with wins.

Edit: improved formatting.

Thank you for your reply. I can't edit my comment any more but where it says:

>Ruby Conf that claimed that Elixir was giving better results (in request time) than rails

I meant:

Ruby Conf that claimed that Elixir was giving better results (in request time) without a cache than rails with a cache

IIRC, the Rails app mentioned in the talk was very old. Not a fair comparison, IMHO.

A couple of answers: 1) Anything that is not just a PoC with a limited number of people connecting to it. For 10 clients connected in total, sure, use the thing you know well. It will be easier to use and write, and your time to market will be better. Otherwise, use the real tool.

2) The db is rarely your problem. Interpretation can be, along with tons of other things. Also, Erlang schedules things far better and uses all the cores of the machine, so it will block far less, etc. It also means that throughput and latency will stay stable as the number of users grows (up to a limit, but that limit will be far higher).

3) Store it in memory, and link other nodes to yours. Tadaaa

4) Where data lives depends on your use case, but why use a db if you do not need it anyway?

In general, web frameworks tend to use the db to hide concurrency.

Also, why no cache: because a cache is complex, costs you another dependency and another program, makes debugging harder, makes your app less deterministic, etc. If you need a cache, use it. But caching is a really hard thing.

> You seem to be familiar with Elixir. May I ask you some questions?

Sure, np. I am mainly an Erlang developer, but the same properties apply to Elixir, because both share the same battle-tested VM -- BEAM.

> I'm asking what kind of web apps or scenarios do you think Elixir is particularly well suited for?

I am not currently working on building web apps directly. But in general Elixir/Erlang would be good for a scenario where there'd be multiple connected clients at the same time. Think maybe something like a car sharing service where the map updates in realtime as cars move through the city. Lots of users chatting together, or playing a game. Maybe they are bidding on ads, or tagging posts with "likes" and so on.

But there might be a significant improvement even in the plain old request-goes-to-database scenario, simply because the VM (BEAM) is better equipped to handle multiple connected sockets, all streaming data in and out. For example, it knows how to take advantage of multiple CPU cores, it handles GC better and so on. Not sure how deeply you want this explanation to go, so will stop here.

But, yes, if it all goes to a single MySQL instance running in another datacenter, that might be the bottleneck so there might not be a speedup seen by using something else. So you have to measure. In that case maybe a cache in front of it will work just as well to speed things up.

> he was doing away with the db completely.

Haven't seen the video. But with Erlang you can think of those processes as in-memory storage of state. You can spawn hundreds of thousands of them, and they can live as long as the node stays up. You can then also connect multiple BEAM VM instances on different machines to create a cluster, and refer to these processes as if they were local, in a rather transparent manner. Or maybe they mean they used Erlang's built-in database (Mnesia). It had some limitations, but in a recent release (19.0) it gained the ability to scale much better, as it can handle pluggable storage such as LevelDB, for example. The advantage there is that the database is integrated into the language. So queries are not written in a different query language and sent via a driver to some other process; instead, queries look like list comprehensions and transactions are just function closures. That can simplify things immensely in some cases.
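
To make the "process as in-memory storage" idea concrete, here is a minimal sketch in Elixir using the standard library's Agent. The module name SessionStore and its functions are made up for illustration (they are not from the talk), and this only covers a single node, not a cluster.

```elixir
# Hypothetical example: one BEAM process that keeps a map in memory.
defmodule SessionStore do
  use Agent

  # Start a process whose state is an empty map.
  def start_link(_opts \\ []) do
    Agent.start_link(fn -> %{} end)
  end

  # Store a value under a key inside the process.
  def put(pid, key, value) do
    Agent.update(pid, &Map.put(&1, key, value))
  end

  # Read a value back; returns nil when the key is absent.
  def get(pid, key) do
    Agent.get(pid, &Map.get(&1, key))
  end
end

{:ok, store} = SessionStore.start_link()
:ok = SessionStore.put(store, "user:42", %{name: "emluque"})
SessionStore.get(store, "user:42")
# => %{name: "emluque"}
```

The state lives only as long as that process does, which is why the clustering (or a database) mentioned above matters once you care about losing it.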

> how do you architect your application in Elixir to keep application state through multiple requests (sessions) on multiple boxes without using something like memcached or redis (or a network disk)?

You can create a cluster and keep state on another node that doesn't go down (as mentioned), or you can still use a database (say Postgres), so it really depends. Hard to give general advice here.

Thank you for your answer. It has been very enlightening.

Part of the problem I had with the videos had to do with the examples they were using (a shopping cart that stores data in process memory rather than in an external db seems a little risky to me).

I will investigate Erlang Clusters. Again, thank you for your answer.

The idea is that you only store the state (shopping cart contents) in memory up until the point that the user is ready to place the order (at which point it is written to disk). Prior to this point, state is stored in an in-memory database that is clustered among all of the running nodes. So yes, there is a risk that you could lose state, but only if all nodes died at the same time. Otherwise, the nodes just recover and re-sync the state.
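
A rough sketch of that pattern, assuming a plain GenServer per cart. The Cart module and the persist_fun callback are hypothetical stand-ins (the callback represents whatever actually writes the order to disk), and the clustering/replication part is deliberately left out.

```elixir
# Hypothetical example: cart contents stay in process memory until checkout.
defmodule Cart do
  use GenServer

  # persist_fun is an assumed one-argument callback that durably stores an order.
  def start_link(persist_fun) when is_function(persist_fun, 1) do
    GenServer.start_link(__MODULE__, persist_fun)
  end

  def add_item(pid, item), do: GenServer.cast(pid, {:add, item})
  def items(pid), do: GenServer.call(pid, :items)
  def checkout(pid), do: GenServer.call(pid, :checkout)

  @impl true
  def init(persist_fun), do: {:ok, %{items: [], persist: persist_fun}}

  # Adding an item only touches in-memory state.
  @impl true
  def handle_cast({:add, item}, state) do
    {:noreply, %{state | items: [item | state.items]}}
  end

  @impl true
  def handle_call(:items, _from, state) do
    {:reply, Enum.reverse(state.items), state}
  end

  # Only at checkout is the order handed to the persistence callback.
  def handle_call(:checkout, _from, state) do
    order = Enum.reverse(state.items)
    result = state.persist.(order)
    {:reply, result, %{state | items: []}}
  end
end

# Usage, with a fake persistence function that just echoes the order back:
{:ok, cart} = Cart.start_link(fn order -> {:ok, order} end)
Cart.add_item(cart, "book")
Cart.add_item(cart, "pen")
Cart.checkout(cart)
# => {:ok, ["book", "pen"]}
```

If this one process dies before checkout the cart is gone, which is exactly the risk the clustered in-memory database is meant to reduce.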